Abstract Craig interpolation is a widespread method for abstraction
in SAT-based verification, with important applications such as Predi- 
cate Abstraction, CounterExample Guided Abstraction Refinement and 
Lazy Abstraction With Interpolation. Most state-of-the-art model check- 
ing techniques based on interpolation require collections of interpolants 
to satisfy particular properties, to which we refer as "collectives"; they 
do not hold in general for all interpolation systems and have to be es- 
tablished for each particular system and verification environment. Nev- 
ertheless, no systematic approach exists that correlates the individual 
interpolation systems and compares the necessary collectives. This paper 
proposes a uniform framework, which encompasses (and generalizes) the 
most common collectives exploited in SAT-based model checking. We use 
it for a systematic study of the collectives and of the requirements they 
pose on the interpolation systems. Our results have immediate practical 
applications to various verification tasks and provide a better theoretical 
overview of collectives in interpolation-based model checking. 



1 Introduction 

Craig interpolation is a popular approach in SAT-based verification [14113] with 
notable applications such as Predicate Abstraction [TU], CounterExample Guided
Abstraction Refinement (CEGAR) [7], and Lazy Abstraction With Interpolation
(LAWI) [15], just to name a few.

Formally, given two formulae $A$ and $B$ such that $A \land B$ is unsatisfiable, a
Craig interpolant is a formula $I$ such that $A$ implies $I$, $I$ is inconsistent with $B$,
and $I$ is defined over the atoms (i.e., propositional variables) common to $A$ and
$B$. It can be seen as an over-approximation of $A$ that is still inconsistent with
$B$. In model checking applications, $A$ typically encodes some finite program
traces, and $B$ denotes error locations. In this case, an interpolant $I$ represents a
set of safe states that over-approximate the states reachable in $A$.
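
The three conditions of the definition can be checked by brute force on small propositional examples. The following is a minimal sketch, not part of the original development: formulae are represented as (predicate, atoms) pairs, and the helper names `assignments`, `implies` and `is_craig_interpolant` are hypothetical.

```python
from itertools import product

def assignments(variables):
    """Yield every truth assignment over the given variable names."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))

def implies(f, g, variables):
    """True iff f => g holds under every assignment."""
    return all(g(env) for env in assignments(variables) if f(env))

def is_craig_interpolant(a, b, itp, variables):
    """a, b, itp are (predicate, atoms) pairs; variables lists all atoms."""
    (fa, va), (fb, vb), (fi, vi) = a, b, itp
    a_implies_i = implies(fa, fi, variables)
    i_refutes_b = not any(fi(env) and fb(env) for env in assignments(variables))
    shared_only = set(vi) <= set(va) & set(vb)
    return a_implies_i and i_refutes_b and shared_only

# Illustrative instance: A = x and (x -> y), B = (not y) and z.
VARS = ["x", "y", "z"]
A = (lambda e: e["x"] and (not e["x"] or e["y"]), {"x", "y"})
B = (lambda e: (not e["y"]) and e["z"], {"y", "z"})
GOOD = (lambda e: e["y"], {"y"})   # implied by A, refutes B, only shared atoms
BAD = (lambda e: e["x"], {"x"})    # implied by A, but consistent with B
```

Here `GOOD` passes all three checks, while `BAD` fails both the inconsistency and the shared-atoms conditions.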

In most verification tasks, a single interpolant, i.e., a single subdivision of 
constraints into two groups A and B, is not sufficient. For example, consider 



3 We write $Itp(A \mid B)$ for an interpolant of $A$ and $B$, and $I_A$ when $B$ is clear from
the context.



the refinement problem in CEGAR: given a spurious error trace $\pi = \tau_1, \ldots, \tau_n$,
where $\tau_i$ is a program statement, find a set of formulae $X_0, \ldots, X_n$ such that
$X_0 = \top$, $X_n = \bot$, and for $1 \le i \le n$, the Hoare triples $\{X_{i-1}\}\ \tau_i\ \{X_i\}$ are
valid. The sequence $\{X_i\}$ justifies that the error trace is infeasible and is used to
refine the abstraction. Assuming that the trace $\pi$ is in Single Static Assignment
(SSA) form, the solution to the refinement problem is a sequence of interpolants
$\{I_i\}$ such that: $I_i = Itp(\tau_1 \ldots \tau_i \mid \tau_{i+1} \ldots \tau_n)$ and $I_{i-1} \land \tau_i \Rightarrow I_i$. That
is, in addition to demanding that each $I_i$ is an interpolant between the prefix
(statements up to position $i$ in the trace) and the suffix (statements following
position $i$), the sequence $\{I_i\}$ of interpolants must be inductive: this property is
known as the path-interpolation property [17].
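
The inductiveness requirement can be made concrete on a toy propositional trace. The sketch below is only an illustration under our own assumptions: it checks $I_0 = \top$, $I_n = \bot$ and $I_{i-1} \land \tau_i \Rightarrow I_i$ by enumeration, and it does not check that each $I_i$ is an interpolant over the shared atoms. The names `valid` and `is_inductive_sequence` are hypothetical.

```python
from itertools import product

def valid(formula, variables):
    """True iff the predicate holds under every assignment."""
    return all(
        formula(dict(zip(variables, values)))
        for values in product([False, True], repeat=len(variables))
    )

def is_inductive_sequence(trace, itps, variables):
    """trace = [tau_1, ..., tau_n]; itps = [I_0, ..., I_n]."""
    if len(itps) != len(trace) + 1:
        return False
    starts_true = valid(itps[0], variables)
    ends_false = valid(lambda e: not itps[-1](e), variables)
    # trace[i] is tau_{i+1}, so each step checks I_i and tau_{i+1} => I_{i+1}.
    steps = all(
        valid(lambda e, i=i: not (itps[i](e) and trace[i](e)) or itps[i + 1](e),
              variables)
        for i in range(len(trace))
    )
    return starts_true and ends_false and steps

# Toy trace: tau_1 = a, tau_2 = (a -> b), tau_3 = not b.
VARS = ["a", "b"]
TRACE = [lambda e: e["a"], lambda e: (not e["a"]) or e["b"], lambda e: not e["b"]]
INDUCTIVE = [lambda e: True, lambda e: e["a"], lambda e: e["b"], lambda e: False]
NOT_INDUCTIVE = [lambda e: True, lambda e: True, lambda e: e["b"], lambda e: False]
```

In `NOT_INDUCTIVE` the second element is too weak: $\top \land (a \to b)$ does not imply $b$, so the sequence cannot justify infeasibility of the trace step by step.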

Other properties (e.g., simultaneous abstraction, interpolation sequence, path-,
symmetric-, and tree-interpolation) can be seen in existing verification frameworks
such as IMPACT [13], Whale pQ, FunFrog [ID] and eVolCheck [20], which
implement instances of Predicate Abstraction [DJ, Lazy Abstraction with In- 
terpolation [15] , Interpolation- based Function Summarization [21] and Upgrade 
Checking [20]. These properties, to which we refer as collectives since they concern
collections of interpolants, are not satisfied by arbitrary sequences of Craig
interpolants and must be established for each interpolation algorithm and verifi- 
cation technique. The problem with the current state of affairs in interpolation- 
based model checking is that there is no systematic approach that would correlate 
the individual interpolation algorithms and compare the necessary collectives. 

Contribution 1: This paper, for the first time, collects, identifies, and uniformly
presents the most common collectives imposed on interpolation by existing
SAT-based verification approaches (see §2).

In addition to the issues related to a diversity of interpolation properties, it is
often desirable to have flexibility in choosing different algorithms for computing
different interpolants in a sequence $\{I_i\}$, rather than using a single interpolation
algorithm (or interpolation system) $Itp_S$, as assumed in the path-interpolation
example above. To guarantee such flexibility, this paper presents a framework
which generalizes the traditional setting consisting of a single interpolation system
to allow for sequences, or families, of interpolation systems. For example,
given a family of systems $\mathcal{F} = \{Itp_{S_i}\}_{i=1}^{n}$, let $I_i = Itp_{S_i}(\tau_1 \ldots \tau_i \mid \tau_{i+1} \ldots \tau_n)$.
If the resulting sequence of interpolants $\{I_i\}$ satisfies the condition of
path-interpolation, we say that the family $\mathcal{F}$ has the path-interpolation property.

Families find practical applicability in several contexts. One example is
IMPACT-style verification, where it is desirable to obtain a path-interpolant
$\{I_i\}$ with weak interpolants at the beginning (i.e., $I_1, I_2, \ldots$) and strong
interpolants at the end (i.e., $\ldots, I_{n-1}, I_n$). This would increase the likelihood of the
sequence being inductive and can be achieved by using a family of systems of
different strength. Another example is software Upgrade Checking, where function
summaries are computed by interpolation. Different functions in a program

4 The notion of families is additionally a useful technical tool to make the discussion 
and the results more general and easier to compare with the prior work of CAV'12 
[17] (which formally defined families for the first time).



could require different levels of abstraction by means of interpolation. A system 
that generates stronger interpolants can yield a tighter abstraction, more closely 
reflecting the behavior of the corresponding function. On the other hand, a sys- 
tem that generates weaker interpolants would give an abstraction which is more 
"tolerant" and is more likely to remain valid when the function is updated. We 
have implemented a framework that realizes the interpolation system of [1] inside 
the OpenSMT tool [3], and we are currently carrying out experiments with the
model checkers FunFrog and eVolCheck, which use OpenSMT for solving and 
interpolation. 

Contribution 2: This paper systematically studies the collectives and the 
relationships among them; in particular, it shows that for families of interpolation 
systems the collectives form a hierarchy, whereas for a single system all but two 
(i.e., path-interpolation and simultaneous abstraction) are equivalent (see §3).

Another issue which this paper deals with is the fact that there exist different 
approaches for generating interpolants. One is to use specialized algorithms: 
examples are procedures based on constraint solving (e.g., [18]), machine learning
(e.g., [55]), and, even, pure verification algorithms like IC3 [5] and PDR |5j 
that can be viewed as computing a path-interpolation sequence. A second, well-known
approach is to extract an interpolant of $A \land B$ from a resolution proof
of unsatisfiability of $A \land B$. Examples are the algorithm by Pudlák [TB] (also
independently proposed by Huang [S] and by Krajíček [H]), the algorithm by
McMillan [12], and the Labeled Interpolation Systems (LISs) of D'Silva et al. [1],
the latter being the most general version of this approach. 

The variety of interpolation algorithms makes it difficult to reason about their 
properties in a systematic manner. At a low level of representation, the challenge 
is determined by the complexity of individual algorithms and by the diversity 
among them, which makes it hard to study them uniformly. On the other hand, 
at a high level, where the details are hidden, not many interesting results can be 
obtained. For this reason, this paper adopts a twofold approach, working both 
at a high and at a low level of representation: at the high level, we give a global 
view of the entire collection of properties and of their relationships and hierarchy; 
at the low level, we obtain additional stronger results for concrete interpolation 
systems. In particular, we first investigate the properties of interpolation systems 
treating them as black boxes, and then focus on LISs. In the paper, the results of 
§3 apply to arbitrary interpolation algorithms, while those of §4 apply to LISs.

Contribution 3: For the first time, this paper gives both sufficient and neces- 
sary conditions for a family of LISs and for a single LIS to enjoy each of the col- 
lectives. In particular, we show that in case of a single system path-interpolation 
is common to all LISs, while simultaneous abstraction is as strong as all other 
(more complex) properties (see §4). Concrete applications of our results are also
discussed in SJU 

Related Work. To our knowledge, despite interpolation being an important 
component of verification, no systematic investigation of verification-related re- 
quirements for interpolants has been done prior to this paper. One exception is 
the work by the first two authors [U] , that studies a subset of the properties in 



the context of LISs. This paper significantly extends the results of that work by 
considering the most common collectives used in SAT-based verification, at the 
same time addressing a wider class of interpolation systems. Moreover, for LISs, 
it provides both the necessary and sufficient conditions for each property. 

2 Interpolation Systems 

This section introduces the basic notions of interpolation, and then proceeds 
to formulate the various collectives exploited in verification. We use the stan- 
dard convention of identifying conjunctions of formulae with sets of formulae 
and concatenation with conjunction, whenever convenient. For example, we
interchangeably use $\{\phi_1, \ldots, \phi_n\}$ and $\phi_1 \cdots \phi_n$ for $\phi_1 \land \ldots \land \phi_n$.

Interpolation System. An interpolation system $Itp_S$ is a function that, given
an inconsistent $\Phi = \{\phi_1, \phi_2\}$, returns a Craig's interpolant, that is a formula
$I_{\phi_1,S} = Itp_S(\phi_1 \mid \phi_2)$ such that:

$\phi_1 \Rightarrow I_{\phi_1,S} \qquad I_{\phi_1,S} \land \phi_2 \Rightarrow \bot \qquad \mathcal{L}_{I_{\phi_1,S}} \subseteq \mathcal{L}_{\phi_1} \cap \mathcal{L}_{\phi_2}$

where $\mathcal{L}_{\phi}$ denotes the atoms of a formula $\phi$. That is, $I_{\phi_1,S}$ is implied by $\phi_1$, is
inconsistent with $\phi_2$ and is defined over the common language of $\phi_1$ and $\phi_2$.

For $\Phi = \{\phi_1, \ldots, \phi_n\}$, we write $I_{\phi_1 \ldots \phi_i, S}$ to denote $Itp_S(\phi_1 \cdots \phi_i \mid \phi_{i+1} \cdots \phi_n)$.
W.l.o.g., we assume that, for any $Itp_S$ and any formula $\phi$, $Itp_S(\top \mid \phi) = \top$ and
$Itp_S(\phi \mid \top) = \bot$, where we equate the constant true $\top$ with the empty formula.
We omit $S$ whenever clear from the context.

An interpolation system $Itp$ is called symmetric if for any inconsistent
$\Phi = \{\phi_1, \phi_2\}$: $Itp(\phi_1 \mid \phi_2) \Leftrightarrow \overline{Itp(\phi_2 \mid \phi_1)}$ (we use the notation $\overline{\phi}$ for the
negation of a formula $\phi$).

A sequence $\mathcal{F} = \{Itp_{S_1}, \ldots, Itp_{S_n}\}$ of interpolation systems is called a family.

Collectives. In the following, we formulate the properties of interpolation systems
that are required by existing verification algorithms. Furthermore, we generalize
the collectives by presenting them over families of interpolation systems
(i.e., we allow the use of different systems to generate different interpolants in a
sequence). Later, we restrict the properties to the more traditional setting of
singleton families.

n-Path Interpolation (PI) was first defined in [S], where it is employed in the
refinement phase of CEGAR-based predicate abstraction. It has also appeared
in [23] under the name interpolation-sequence, where it is used for a specialized
interpolation-based hardware verification algorithm.

Formally, a family of $n + 1$ interpolation systems $\{Itp_{S_0}, \ldots, Itp_{S_n}\}$ has the
n-path interpolation property (n-PI) iff for any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$
and for $0 \le i \le n - 1$ (recall that $I_{\top} = \top$ and $I_{\Phi} = \bot$):

$(I_{\phi_1 \ldots \phi_i, S_i} \land \phi_{i+1}) \Rightarrow I_{\phi_1 \ldots \phi_{i+1}, S_{i+1}}$

n-Generalized Simultaneous Abstraction (GSA) is the generalization of
simultaneous abstraction, a property that first appeared, under the name symmetric
interpolation, in [TU], where it is used for approximation of a transition



relation for predicate abstraction. We changed the name to avoid confusion with 
the notion of symmetric interpolation system (see above) . The reason for gener- 
alizing the property will be apparent later. 

Formally, a family of $n + 1$ interpolation systems $\{Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has the
n-generalized simultaneous abstraction property (n-GSA) iff for any inconsistent
$\Phi = \{\phi_1, \ldots, \phi_{n+1}\}$:

$\bigwedge_{i=1}^{n} I_{\phi_i, S_i} \Rightarrow I_{\phi_1 \ldots \phi_n, S_{n+1}}$

The case $n = 2$ is called Binary GSA (BGSA): $I_{\phi_1,S_1} \land I_{\phi_2,S_2} \Rightarrow I_{\phi_1\phi_2,S_3}$.
If $\phi_{n+1} = \top$, the property is called n-simultaneous abstraction (n-SA):
$\bigwedge_{i=1}^{n} I_{\phi_i,S_i} \Rightarrow \bot\ (= I_{\phi_1 \ldots \phi_n, S_{n+1}})$ and, if $n = 2$, Binary SA (BSA). In n-SA
$Itp_{S_{n+1}}$ is irrelevant and is often omitted.
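
As a small illustration of BGSA (an example of our own, not taken from the cited works), consider $\phi_1 = a$, $\phi_2 = \neg a \lor b$ and $\phi_3 = \neg b$, and suppose the three systems return the interpolants below. Other valid interpolants exist for each configuration, so this is only one possible instance.

```latex
\begin{align*}
I_{\phi_1,S_1} &= Itp_{S_1}(\phi_1 \mid \phi_2\phi_3) = a, \\
I_{\phi_2,S_2} &= Itp_{S_2}(\phi_2 \mid \phi_1\phi_3) = \neg a \lor b, \\
I_{\phi_1\phi_2,S_3} &= Itp_{S_3}(\phi_1\phi_2 \mid \phi_3) = b, \\
I_{\phi_1,S_1} \land I_{\phi_2,S_2} &= a \land (\neg a \lor b) \;\Rightarrow\; b = I_{\phi_1\phi_2,S_3}.
\end{align*}
```

Had the third system returned a stronger formula than $b$ over the shared atom, the implication could fail, which is why the property constrains the three systems jointly.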

n-State-Transition Interpolation (STI) is defined as a combination of PI
and SA in a single family of systems. It was introduced in [1] as part of the
inter-procedural verification algorithm Whale. Intuitively, the "state" interpolants
over-approximate the set of reachable states, and the "transition" interpolants
summarize the transition relations (or function bodies). The STI requirement
ensures that state over-approximation is "compatible" with the summarization.
That is, $\{I_{\phi_1 \ldots \phi_i, S_i}\}\ I_{\phi_{i+1}, T_{i+1}}\ \{I_{\phi_1 \ldots \phi_{i+1}, S_{i+1}}\}$ is a valid Hoare triple for each $i$.

Formally, a family of interpolation systems $\{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$
has the n-state-transition interpolation property (n-STI) iff for any inconsistent
$\Phi = \{\phi_1, \ldots, \phi_n\}$ and for $0 \le i \le n - 1$:

$(I_{\phi_1 \ldots \phi_i, S_i} \land I_{\phi_{i+1}, T_{i+1}}) \Rightarrow I_{\phi_1 \ldots \phi_{i+1}, S_{i+1}}$

T-Tree Interpolation (TI) is a generalization of classical interpolation used in
model checking applications, in which partitions of an unsat formula naturally
correspond to a tree structure such as call tree or program unwinding. The
collective was first introduced by McMillan and Rybalchenko for computing
post-fixpoints of relational equations (e.g., used in analysis of recursive programs). It
is currently supported by iZ3, an interpolating version of Z3 (see footnote 5). It is also at the
base of the nested-interpolants of Heizmann et al. [B].

Formally, let $T = (V, E)$ be a tree with $n$ nodes $V = [1, \ldots, n]$. A family of
$n$ interpolation systems $\{Itp_{S_1}, \ldots, Itp_{S_n}\}$ has the T-tree interpolation property
(T-TI) iff for any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$:

$\bigwedge_{(i,j) \in E} I_{F_j, S_j} \land \phi_i \Rightarrow I_{F_i, S_i}$

where $F_i = \{\phi_j \mid i \sqsubseteq j\}$, and $i \sqsubseteq j$ iff node $j$ is a descendant of node $i$ in $T$.
Notice that for the root $i$ of $T$, $F_i = \Phi$ and $I_{F_i, S_i} = \bot$.
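
To make the sets $F_i$ concrete, the following sketch computes them for a small tree. It assumes that edges are given as (parent, child) pairs and that $i \sqsubseteq i$ holds, which is needed for the root to satisfy $F_i = \Phi$; the function name `subtree_sets` is hypothetical.

```python
def subtree_sets(n, edges):
    """Return {i: F_i}, where F_i holds i and the indices of all its descendants.

    Nodes are 1..n and edges are (parent, child) pairs of a tree.
    """
    children = {i: [] for i in range(1, n + 1)}
    for parent, child in edges:
        children[parent].append(child)

    def collect(i):
        nodes = {i}
        for child in children[i]:
            nodes |= collect(child)
        return nodes

    return {i: collect(i) for i in range(1, n + 1)}

# Example tree: 1 is the root with children 2 and 3; node 3 has child 4.
EDGES = [(1, 2), (1, 3), (3, 4)]
F = subtree_sets(4, EDGES)
```

For this tree, the T-TI obligation at node 3 reads $I_{F_4} \land \phi_3 \Rightarrow I_{F_3}$ with $F_3 = \{\phi_3, \phi_4\}$, and at the root the conclusion is $\bot$.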

An interpolation system $Itp_S$ is said to have a property $P$ (or, simply, to
have $P$), where $P$ is one of the properties defined above, if every family
induced by $Itp_S$ has $P$. For example, $Itp_S$ has GSA iff for every $k$ the family
$\{Itp_{S_1}, \ldots, Itp_{S_k}\}$, where $Itp_{S_i} = Itp_S$ for all $i$, has k-GSA.

5 http://research.microsoft.com/en-us/um/redmond/projects/z3/iz3documentation.html



3 Collectives of Interpolation Systems 

In this section, we study collectives of general interpolation systems, that is, we 
treat interpolation systems as black-boxes. In §4 we will extend the study
to the implementation-level details of the LISs.

Collectives of Single Systems. We begin by studying the relationships among 
the various collectives of single interpolation systems. 

Theorem 1. Let $Itp_S$ be an interpolation system. The following are equivalent:
$Itp_S$ has BGSA (1), $Itp_S$ has GSA (2), $Itp_S$ has TI (3), $Itp_S$ has STI (4).

Proof. We show that $1 \to 2$, $2 \to 3$, $3 \to 4$, $4 \to 1$.

($1 \to 2$) Assume $Itp_S$ has BGSA. Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_{n+1}\}$.
Then, for $2 \le i \le n$: $(I_{\phi_1 \ldots \phi_{i-1}} \land I_{\phi_i}) \Rightarrow I_{\phi_1 \ldots \phi_i}$, which together yield
$(\bigwedge_{i=1}^{n} I_{\phi_i}) \Rightarrow I_{\phi_1 \ldots \phi_n}$. Hence, $Itp_S$ has GSA.

($2 \to 3$) Let $T = ([1, \ldots, n], E)$, take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$.
Since $Itp_S$ has GSA: $(\bigwedge_{(i,j) \in E} I_{F_j} \land I_{\phi_i}) \Rightarrow I_{F_i}$, and, from the definition of Craig
interpolation, $\phi_i \Rightarrow I_{\phi_i}$. Hence, $Itp_S$ has T-TI.

($3 \to 4$) Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$ and extend it to a $\Phi'$ by
adding $n$ copies of $\top$ at the end. Define a tree $T_{STI} = ([1, \ldots, 2n], E)$ s.t.:
$E = \{(n + i, i) \mid 1 \le i \le n\} \cup \{(n + i, n + i - 1) \mid 1 < i \le n\}$. Then, for $1 \le i \le n$,
$F_i = \{\phi_i\}$ and $F_{n+i} = \{\phi_1, \ldots, \phi_i\}$, where $F_i$ is as in the definition of T-TI. By
the T-TI property: $(I_{F_{n+i}} \land I_{F_{i+1}} \land \top) \Rightarrow I_{F_{n+i+1}}$, which is equivalent to STI.

($4 \to 1$) Follows from STI being syntactically equivalent to BGSA for $i = 1$.
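
The tree used in the step from TI to STI can be checked mechanically. The sketch below builds the edge set for a given $n$ and recovers the original formulae below each node; it assumes the second edge set ranges over $1 < i \le n$ (the bounds are our reading of the construction), and the function names are hypothetical.

```python
def sti_tree_edges(n):
    """Edges (parent, child) of the tree over nodes 1..2n used for STI."""
    leaf_edges = [(n + i, i) for i in range(1, n + 1)]
    spine_edges = [(n + i, n + i - 1) for i in range(2, n + 1)]
    return leaf_edges + spine_edges

def descendants_or_self(node, edges):
    """All nodes reachable from `node` by following parent-to-child edges."""
    found = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for parent, child in edges:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found

def original_formulas(node, n, edges):
    """Indices of phi_1..phi_n in F_node; nodes n+1..2n only carry the constant true."""
    return {j for j in descendants_or_self(node, edges) if j <= n}
```

For example, with `n = 4` the node `n + 3` collects $\{\phi_1, \phi_2, \phi_3\}$, matching $F_{n+i} = \{\phi_1, \ldots, \phi_i\}$ once the padding copies of $\top$ are dropped.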

Theorem 1 has a few simple extensions. First, GSA implies SA directly from
the definitions. Similarly, since $\phi \Rightarrow I_{\phi}$, STI implies PI. Finally, we conjecture
that both SA and PI are strictly weaker than the rest. In £g] (Theorem ITB|) . we 
show that for LISs, PI is strictly weaker than SA. As for SA, we show that it 
is equivalent to BGSA in symmetric interpolation systems (Proposition [1] in the 
appendix). But, in the general case, the conjecture remains open. 

These results define a hierarchy of collectives which is summarized in Fig. 1,
where the edges indicate implications among the collectives. Note that SA $\to$
GSA holds only for symmetric systems.

In summary, the main contribution in the setting of a single system is the 
proof that almost all collectives are equivalent and the hierarchy of the collectives 
collapses. From a practical perspective, this means that McMillan's interpolation 
system (implemented by most interpolating SMT-solvers) has all of the collec- 
tive properties, including the recently introduced TI. 

Collectives of Families of Systems. Here, we study collectives of families 
of interpolation systems. We first show that the collectives introduced in §2
directly extend from families to sub- families. Second, we examine the hierarchy 
of the relationships among the properties. Finally, we conclude by discussing the 
practical implications of these results. 

Collectives of Sub-families. If a family of interpolation systems $\mathcal{F}$ has a
property $P$, then sub-families of $\mathcal{F}$ have $P$ as well. We state this formally for
k-STI (since we use it in a subsequent proof; similar statements for the
other collectives are discussed in the appendix).

Theorem 2. A family $\{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-STI iff for all
$k \le n$ the sub-family $\{Itp_{S_0}, \ldots, Itp_{S_k}\} \cup \{Itp_{T_1}, \ldots, Itp_{T_k}\}$ has k-STI.

Relationships Among Collectives. We now show the relationships among
collectives. First, we note that n-SA and BGSA are equivalent for symmetric
interpolation systems. Whenever a family $\mathcal{F} = \{Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has
$(n + 1)$-SA and $Itp_{S_{n+1}}$ is symmetric, then $\mathcal{F}$ has n-GSA (Proposition [2] in the
appendix, which is the analogue of Proposition [1] for single systems).

In the rest of the section, we delineate the hierarchy of collectives. In particular,
we show that T-TI is the most general collective, immediately followed
by n-GSA, which is followed by BGSA and n-STI, which are equivalent, and at
last by n-SA and n-PI. The first result is that the n-STI property implies both
the n-PI and n-SA properties separately:

Theorem 3. If a family $\mathcal{F} = \{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-STI
then (1) $\{Itp_{S_0}, \ldots, Itp_{S_n}\}$ has n-PI and (2) $\{Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-SA.

A natural question to ask is whether the converse of Theorem 3 is true. That
is, whether the family $\mathcal{F}_1 \cup \mathcal{F}_2$ that combines two arbitrary families $\mathcal{F}_1$ and $\mathcal{F}_2$
that independently enjoy n-PI and n-SA, respectively, has n-STI. We show in
Theorem 11 that this is not the case.

As for BGSA, the n-STI property is closely related to it: deciding whether
a family $\mathcal{F}$ has n-STI is in fact reducible to deciding whether a collection of
sub-families of $\mathcal{F}$ has BGSA.

Theorem 4. A family $\mathcal{F} = \{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-STI iff
$\{Itp_{S_i}, Itp_{T_{i+1}}, Itp_{S_{i+1}}\}$ has BGSA for all $0 \le i \le n - 1$.

From Theorem 4 and Theorem 3 we derive:

Corollary 1. If there exists a family $\{Itp_{S_0}, \ldots, Itp_{S_n}\} \cup \{Itp_{T_1}, \ldots, Itp_{T_n}\}$
s.t. $\{Itp_{S_i}, Itp_{T_{i+1}}, Itp_{S_{i+1}}\}$ has BGSA for all $0 \le i \le n - 1$, then $\{Itp_{T_1}, \ldots, Itp_{T_n}\}$
has n-SA.

We now relate T-TI and n-GSA. Note that the need for two theorems with
different statements arises from the asymmetry between the two properties: all
$\phi_i$ are abstracted by interpolation in n-GSA, whereas in T-TI a formula is not
abstracted when considering the corresponding parent together with its children.

Theorem 5. Given a tree $T = (V, E)$, if a family $\mathcal{F} = \{Itp_{S_i}\}_{i \in V}$ has T-TI,
then, for every parent $i_{k+1}$ and its children $i_1, \ldots, i_k$:

1. If $i_{k+1}$ is the root, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}\}$ has k-SA.

2. Otherwise, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{S_{i_{k+1}}}\}$ has k-GSA.



All proofs can be found in the appendix. 



Theorem 6. Given a tree $T = (V, E)$, a family $\mathcal{F} = \{Itp_{S_i}\}_{i \in V}$ has T-TI if,
for every node $i_{k+1}$ and its children $i_1, \ldots, i_k$, there exists $T_{i_{k+1}}$ such that:

1. If $i_{k+1}$ is the root, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{T_{i_{k+1}}}\}$ has $(k + 1)$-SA.

2. Otherwise, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{T_{i_{k+1}}}, Itp_{S_{i_{k+1}}}\}$ has $(k + 1)$-GSA.

An important observation is that the T-TI property is the most general, in the
sense that it realizes any of the other properties, given an appropriate choice
of the tree $T$. We state here (and prove in the appendix) that n-GSA and n-STI
can be implemented by T-TI for some $T_{GSA}$ and $T_{STI}$; the remaining cases
can be derived in a similar manner. Note that the converse implications are not
necessarily true in general, since the tree-interpolation requirement is stronger.

Theorem 7. If a family $\mathcal{F} = \{Itp_{S_{n+1}}, Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has $T_{GSA}$-TI, then
$\{Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has n-GSA.

Theorem 8. If a family $\mathcal{F} = \{Itp_{S_0}, \ldots, Itp_{S_n}\} \cup \{Itp_{T_1}, \ldots, Itp_{T_n}\}$ has $T_{STI}$-TI,
then it has n-STI.

The results so far (including Theorem [11] of §4) define a hierarchy of collectives
which is summarized in Fig. 2. The solid edges indicate direct implication
between properties; SA $\to$ GSA requires symmetry, while GSA $\to$ TI requires
the existence of an additional set of interpolation systems. The dashed edges
represent the ability of TI to realize all the other properties for an appropriate
tree; only the edges to STI and GSA are shown, the other ones are implicit.
The dash-dotted edges represent the sub-family properties.

An immediate application of our results is that they show how to overcome
limitations of existing implementations. For example, they enable the trivial
construction of tree-interpolants in MathSat (see footnote 7), which are currently only
available in iZ3, thus enabling its use for Upgrade Checking [20], by reusing the existing
BGSA-interpolation implementation of MathSat. Similarly, our results enable the
construction of BGSA and GSA interpolants in iZ3 (currently only available in
MathSat), thus enabling the use of iZ3 in Whale.




Figure 1: Single systems collectives.
Figure 2: Families of systems collectives.

4 Collectives of Labeled Interpolation Systems

In this section, we move from the abstract level of general interpolation systems 
to the implementation level of the Labeled Interpolation Systems. After intro- 
ducing and defining LISs, we study collectives of families, then summarize the 
results for single LISs, also answering the questions left open in §3.

7 http://mathsat.fbk.eu/ 
Figure 3: Labeled interpolation system $Itp_L$.

There are several state-of-the-art approaches for automatically computing
interpolants. The most successful techniques derive an interpolant for $A \land B$ from
a resolution proof of the unsatisfiability of the conjunction. Noteworthy examples
are the algorithm independently developed by Pudlák [IB], Huang [5] and
Krajíček, and the one by McMillan [12]. These algorithms are implemented
recursively by initially computing partial interpolants for the axioms (leaves of
the proof), and then, following the proof structure, by computing a partial
interpolant for each conclusion from those of the premises. The partial interpolant
of the root of the proof is the interpolant for the formula. In this section, we
review these algorithms following the framework of D'Silva et al. [3].

Resolution Proofs. We assume a countable set of propositional variables. A
literal is a variable, either with positive ($p$) or negative ($\overline{p}$) polarity. A clause $C$ is
a finite disjunction of literals; a formula $\Phi$ in conjunctive normal form (CNF) is a
finite conjunction of clauses. A resolution proof of unsatisfiability (or refutation)
of a formula $\Phi$ in CNF is a tree such that the leaves are the clauses of $\Phi$, the
root is the empty clause $\bot$ and the inner nodes are clauses generated via the
resolution rule (where $C^+ \lor p$ and $C^- \lor \overline{p}$ are the antecedents, $C^+ \lor C^-$ the
resolvent, and $p$ is the pivot):

$\dfrac{C^+ \lor p \qquad C^- \lor \overline{p}}{C^+ \lor C^-}$
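
The resolution rule is easy to state operationally. The sketch below uses a common encoding that is our own choice here, not one prescribed by the text: literals are nonzero integers (`+v` for a variable, `-v` for its negation) and clauses are frozensets. The function name `resolve` is hypothetical.

```python
def resolve(pos_clause, neg_clause, pivot):
    """Resolve two clauses on `pivot`.

    pos_clause must contain the literal +pivot and neg_clause the literal -pivot;
    the resolvent keeps every other literal of both antecedents.
    """
    if pivot not in pos_clause or -pivot not in neg_clause:
        raise ValueError("pivot must occur positively and negatively")
    return (pos_clause - {pivot}) | (neg_clause - {-pivot})

# Refutation of {x1}, {not x1 or x2}, {not x2}.
C1 = frozenset({1})
C2 = frozenset({-1, 2})
C3 = frozenset({-2})
STEP = resolve(C1, C2, 1)      # derives the unit clause {x2}
ROOT = resolve(STEP, C3, 2)    # derives the empty clause
```

The empty frozenset plays the role of $\bot$ at the root of the refutation.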

Labelings and Interpolant Strength. D'Silva et al. [1] generalize the algorithms
by Pudlák [IB] and McMillan [12] for propositional resolution systems by
introducing the notion of labeled interpolation system (LIS), focusing on the concept
of interpolant strength (a formula $\phi$ is stronger than $\psi$ whenever $\phi \Rightarrow \psi$).

Given a refutation of a formula $A \land B$, a variable $p$ can appear as a literal
only in $A$, only in $B$ or in both; $p$ is respectively said to have class $A$, $B$ or $AB$.
A labeling $L$ is a mapping that assigns a label among $\{a, b, ab\}$ independently
to each variable in each clause; we assume that no clause has both a literal and
its negation, so assigning a label to variables or literals is equivalent. The set
of possible labelings is restricted by ensuring that class $A$ variables have label $a$
and class $B$ variables label $b$; $AB$ variables can be labeled either $a$, $b$ or $ab$.

In [1], a labeled interpolation system (LIS) is defined as a procedure $Itp_L$
(shown in Fig. 3) that, given $A$, $B$, a refutation $R$ of $A \land B$ and a labeling $L$,
outputs a partial interpolant $I_{A,L}(C) = Itp_L(A \mid B)(C)$ for any clause $C$ in
$R$; this depends on the clause being in $A$ or $B$ (if leaf) and on the label of the
pivot associated with the resolution step (if inner node). $I_{A,L} = Itp_L(A \mid B)$
represents the interpolant for $A \land B$, that is $Itp_L(A \mid B)(\bot)$. We omit the
parameters whenever clear from the context.

In Fig. 3, $C|_{\alpha}$ denotes the restriction of a clause $C$ to the variables with
label $\alpha$. $p : \alpha$ indicates that variable $p$ has label $\alpha$. By $C[I]$ we represent that
clause $C$ has a partial interpolant $I$. $I^+$, $I^-$ and $I$ are the partial interpolants
respectively associated with the two antecedents and the resolvent of a resolution
step: $I^+ = Itp_L(C^+ \lor p)$, $I^- = Itp_L(C^- \lor \overline{p})$, $I = Itp_L(C^+ \lor C^-)$.
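
Since the rules of Fig. 3 are not reproduced in the text, the following sketch relies on the rules as they are usually stated for labeled interpolation systems, which should be treated as an assumption here: a leaf $C$ in $A$ yields $C|_b$, a leaf in $B$ yields $\neg(C|_a)$, and an inner node yields $I^+ \lor I^-$, $I^+ \land I^-$ or $(I^+ \lor p) \land (I^- \lor \overline{p})$ when the pivot is labeled $a$, $b$ or $ab$. Labels are uniform per variable, and all function names are hypothetical.

```python
from itertools import product

# Literals are nonzero ints (+v / -v); clauses are frozensets of literals.
# `labels` maps each variable to "a", "b" or "ab" (uniform over occurrences).

def lit_value(lit, env):
    value = env[abs(lit)]
    return value if lit > 0 else not value

def restrict(clause, labels, wanted):
    """Disjunction of the literals of `clause` whose variable has label `wanted`."""
    kept = [lit for lit in clause if labels[abs(lit)] == wanted]
    return lambda env: any(lit_value(lit, env) for lit in kept)

def partial_interpolants(proof, labels):
    """proof: nodes in topological order, each either
    ("leaf", clause, "A" or "B") or ("res", pos_idx, neg_idx, pivot),
    where node pos_idx contains +pivot and node neg_idx contains -pivot."""
    itps = []
    for node in proof:
        if node[0] == "leaf":
            _, clause, side = node
            if side == "A":
                itp = restrict(clause, labels, "b")
            else:
                in_a = restrict(clause, labels, "a")
                itp = lambda env, f=in_a: not f(env)
        else:
            _, pos, neg, pivot = node
            i_pos, i_neg, label = itps[pos], itps[neg], labels[pivot]
            if label == "a":
                itp = lambda env, f=i_pos, g=i_neg: f(env) or g(env)
            elif label == "b":
                itp = lambda env, f=i_pos, g=i_neg: f(env) and g(env)
            else:
                itp = lambda env, f=i_pos, g=i_neg, p=pivot: (
                    (f(env) or env[p]) and (g(env) or not env[p]))
        itps.append(itp)
    return itps

# A = (x1) and (not x1 or x2), B = (not x2); x1 is local to A, x2 is shared.
PROOF = [
    ("leaf", frozenset({1}), "A"),
    ("leaf", frozenset({-1, 2}), "A"),
    ("res", 0, 1, 1),
    ("leaf", frozenset({-2}), "B"),
    ("res", 2, 3, 2),
]

def final_interpolant_table(shared_label):
    """Truth table of the root interpolant when x2 gets `shared_label`."""
    labels = {1: "a", 2: shared_label}
    root = partial_interpolants(PROOF, labels)[-1]
    return {(x1, x2): root({1: x1, 2: x2})
            for x1, x2 in product([False, True], repeat=2)}
```

On this tiny refutation every choice of label for the shared variable produces an interpolant equivalent to $x_2$; larger proofs are needed to observe labelings that yield interpolants of different strength.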



An operator $\sqcup$ determines the label of a pivot $p$, taking into account
that $p$ might have different labels $\alpha$ and $\beta$ in the two antecedents: $\sqcup$ is
idempotent, symmetric and defined by $a \sqcup b = ab$, $a \sqcup ab = ab$, $b \sqcup ab = ab$.
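
The join on labels is small enough to write down directly. A minimal sketch, with labels encoded as the strings "a", "b" and "ab" (an encoding chosen here for illustration) and a hypothetical function name `join`:

```python
LABELS = ("a", "b", "ab")

def join(x, y):
    """Label of a pivot whose occurrences carry labels x and y in the antecedents."""
    # Equal labels are kept (idempotence); any two distinct labels join to "ab".
    return x if x == y else "ab"
```

This single rule covers all three defining equations, since every pair of distinct labels either contains "ab" already or is the pair of "a" and "b".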

The systems corresponding to McMillan and Pudlák's interpolation algorithms
are referred to as $Itp_M$ and $Itp_P$; the system dual to McMillan's is $Itp_{M'}$.
$Itp_M$, $Itp_P$ and $Itp_{M'}$ are obtained as special cases of $Itp_L$ by labeling all the
occurrences of $AB$ variables with $b$, $ab$ and $a$, respectively (see [1] and [17]).

A total order $\preceq$ is defined over labels as $b \preceq ab \preceq a$, and pointwise extended
to a partial order over labelings: $L \preceq L'$ if, for every clause $C$ and variable
$p$ in $C$, $L(p, C) \preceq L'(p, C)$. This allows the authors to directly compare the
logical strength of the interpolants produced by two systems. In fact, for any
refutation $R$ of a formula $A \land B$ and labelings $L$, $L'$ such that $L \preceq L'$, we
have $Itp_L(A, B, R) \Rightarrow Itp_{L'}(A, B, R)$ and we say that $Itp_L$ is stronger than
$Itp_{L'}$.
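
The order on labels and its pointwise extension can be sketched as follows. Labelings are modelled as dictionaries from (variable, clause) occurrences to labels, which is a representation assumed here for illustration, and the occurrences in the example are taken to be of class $AB$; all names are hypothetical.

```python
RANK = {"b": 0, "ab": 1, "a": 2}   # encodes b <= ab <= a

def label_leq(x, y):
    return RANK[x] <= RANK[y]

def labeling_leq(l1, l2):
    """Pointwise comparison of two labelings over the same occurrences."""
    return l1.keys() == l2.keys() and all(label_leq(l1[k], l2[k]) for k in l1)

def uniform(occurrences, label):
    """Labeling that gives the same label to every listed occurrence."""
    return {occ: label for occ in occurrences}

# Occurrences of AB-class variables, written as (variable, clause id) pairs.
OCCS = [("p", 0), ("p", 1), ("q", 1)]
MCMILLAN = uniform(OCCS, "b")
PUDLAK = uniform(OCCS, "ab")
DUAL_MCMILLAN = uniform(OCCS, "a")
```

With these definitions the three uniform labelings form a chain, in line with the statement that a smaller labeling yields a stronger interpolant.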

Since a labeled system $Itp_L$ is uniquely determined by the labeling $L$, when
discussing a family of LISs $\{Itp_{L_1}, \ldots, Itp_{L_n}\}$ we will refer to the corresponding
family of labelings as $\{L_1, \ldots, L_n\}$.

Labeling Notation. In the previous sections, we saw how the various collectives
involve the generation of multiple interpolants from a single inconsistent formula
$\Phi = \{\phi_1, \ldots, \phi_n\}$ for different subdivisions of $\Phi$ into an $A$ and a $B$ part; we
refer to these ways of splitting $\Phi$ as configurations. Remember that a labeling
$L$ has freedom in assigning labels only to occurrences of variables of class $AB$;
each configuration identifies these variables.

Since we deal with several configurations at a time, it is useful to separate the
variables into partitions of $\Phi$ depending on whether the variables are local to a
$\phi_i$ or shared, taking into account all possible combinations. For example, Table 1
is the labeling table that characterizes 3-SA. Recall that in 3-SA we are given an
inconsistent $\Phi = \{\phi_1, \phi_2, \phi_3\}$ and a family of labelings $\{L_1, L_2, L_3\}$ and generate
three interpolants $I_{\phi_1,L_1}$, $I_{\phi_2,L_2}$, $I_{\phi_3,L_3}$. The labeling $L_i$ is associated with the
$i$th configuration. For example, the table shows that $L_1$ can independently assign
a label from $\{a, b, ab\}$ to each occurrence of each variable shared between $\phi_1$ and
$\phi_2$, $\phi_1$ and $\phi_3$ or $\phi_1$, $\phi_2$ and $\phi_3$.
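
The class entries of a labeling table follow mechanically from where a variable occurs and from the configuration. The sketch below derives them for the 3-SA and BGSA configurations; it computes only the classes $A$, $B$ and $AB$, not the label names, and the function names are hypothetical.

```python
from itertools import combinations

def var_class(occurs_in, a_part):
    """Class of a variable occurring in the formulae `occurs_in`
    when the formulae in `a_part` form A and all the others form B."""
    in_a = bool(occurs_in & a_part)
    in_b = bool(occurs_in - a_part)
    if in_a and in_b:
        return "AB"
    return "A" if in_a else "B"

def labeling_table(n, configurations):
    """Map every nonempty set of formula indices to its class per configuration."""
    rows = {}
    for size in range(1, n + 1):
        for subset in combinations(range(1, n + 1), size):
            occ = frozenset(subset)
            rows[occ] = [var_class(occ, frozenset(cfg)) for cfg in configurations]
    return rows

# 3-SA splits off one formula at a time; BGSA uses phi_1, phi_2 and phi_1 phi_2 as A.
SA3 = labeling_table(3, [{1}, {2}, {3}])
BGSA = labeling_table(3, [{1}, {2}, {1, 2}])
```

For instance, a variable shared only between $\phi_1$ and $\phi_2$ has class $B$ in the third 3-SA configuration but class $A$ in the third BGSA configuration, so neither labeling has any freedom there.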

When talking about an occurrence of a variable $p$ in a certain partition
$\phi_{i_1} \cdots \phi_{i_k}$, it is convenient to associate to $p$ and the partition a labeling vector
$(\eta_{i_1}, \ldots, \eta_{i_k})$, representing the labels assigned to $p$ by $L_{i_1}, \ldots, L_{i_k}$ in
configuration $i_1, \ldots, i_k$ (all other labels are fixed). Strength of labeling vectors is compared
pointwise, extending the linear order $b \preceq ab \preceq a$ as described earlier.

We reduce the problem of deciding whether a family $\mathcal{F} = \{Itp_{L_1}, \ldots, Itp_{L_n}\}$
has an interpolation property $P$ to showing that all labeling vectors of $\{L_1, \ldots, L_n\}$
satisfy a certain set of labeling constraints. For simplicity of presentation, in the
rest of the paper we assume that all occurrences of a variable are labeled uniformly.
The extension to differently labeled occurrences is straightforward.
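
$p$ in                     $\phi_1 \mid \phi_2\phi_3$    $\phi_2 \mid \phi_1\phi_3$    $\phi_3 \mid \phi_1\phi_2$
$\phi_1$                   $A, a$                        $B, b$                        $B, b$
$\phi_2$                   $B, b$                        $A, a$                        $B, b$
$\phi_3$                   $B, b$                        $B, b$                        $A, a$
$\phi_1\phi_2$             $AB, \alpha_1$                $AB, \alpha_2$                $B, b$
$\phi_2\phi_3$             $B, b$                        $AB, \beta_2$                 $AB, \beta_3$
$\phi_1\phi_3$             $AB, \gamma_1$                $B, b$                        $AB, \gamma_3$
$\phi_1\phi_2\phi_3$       $AB, \delta_1$                $AB, \delta_2$                $AB, \delta_3$

Table 1: 3-SA.

$p$ in                     $\phi_1 \mid \phi_2\phi_3$    $\phi_2 \mid \phi_1\phi_3$    $\phi_1\phi_2 \mid \phi_3$
$\phi_1$                   $A, a$                        $B, b$                        $A, a$
$\phi_2$                   $B, b$                        $A, a$                        $A, a$
$\phi_3$                   $B, b$                        $B, b$                        $B, b$
$\phi_1\phi_2$             $AB, \alpha_1$                $AB, \alpha_2$                $A, a$
$\phi_2\phi_3$             $B, b$                        $AB, \beta_2$                 $AB, \beta_3$
$\phi_1\phi_3$             $AB, \gamma_1$                $B, b$                        $AB, \gamma_3$
$\phi_1\phi_2\phi_3$       $AB, \delta_1$                $AB, \delta_2$                $AB, \delta_3$

Table 2: BGSA.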





Collectives of LISs Families. We derive in the following both necessary and 
sufficient conditions for the collectives to hold in the context of LISs families. 
The practical significance of our results is to identify which LISs satisfy which 
collectives. In particular, for the first time, we show that not all LISs identified 
by D'Silva et al. satisfy all collectives. This work provides an essential guide 
for using interpolant strength results when collectives are required (such as in 
Upgrade Checking). 

We proceed as follows. First, we identify necessary and sufficient labeling 
constraints to characterize BGSA. Second, we extend them to n-GSA and to 
n-SA. Third, we exploit the connections between BGSA and n-GSA on one side, 
and n-STI and T-TI on the other (Theorem 0] Lemma [5J Lemma O to derive 
the labeling constraints both for n-STI and T-TI, thus completing the picture. 

BGSA. Let $\Phi = \{\phi_1, \phi_2, \phi_3\}$ be an unsatisfiable formula in CNF, and
$\mathcal{F} = \{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ a family of LISs. We want to identify the restrictions on the
labeling vectors of $\{L_1, L_2, L_3\}$ for which $\mathcal{F}$ has BGSA, i.e.,
$I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \Rightarrow I_{\phi_1\phi_2,L_3}$. We define a set of BGSA constraints $CC_{BGSA}$ on labelings as follows.
A family of labelings $\{L_1, L_2, L_3\}$ satisfies $CC_{BGSA}$ iff:

$$(\alpha_1,\alpha_2),\ (\delta_1,\delta_2) \preceq \{(ab,ab), (b,a), (a,b)\}, \quad \beta_2 \preceq \beta_3, \quad \gamma_1 \preceq \gamma_3, \quad \delta_1 \preceq \delta_3, \quad \delta_2 \preceq \delta_3$$

hold for all variables, where $\alpha_i$, $\beta_i$, $\gamma_i$ and $\delta_i$ are as shown in Table 2, the labeling
table for BGSA. $* \preceq \{*_1, *_2\}$ denotes that $* \preceq *_1$ or $* \preceq *_2$ (both can be true).
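
The constraints above can be checked mechanically for the labels of one variable. The following is a minimal sketch, assuming labels are encoded as the strings "a", "b", "ab" and ordered as $b \preceq ab \preceq a$; the function names and the tuple encoding are illustrative assumptions, not part of the paper.

```python
from itertools import product

# Assumed encoding of the label order b <= ab <= a.
RANK = {"b": 0, "ab": 1, "a": 2}
LABELS = ("a", "b", "ab")
PAIRS = [("ab", "ab"), ("b", "a"), ("a", "b")]


def leq(x, y):
    """True when label x precedes or equals label y."""
    return RANK[x] <= RANK[y]


def below_some(vec, allowed):
    """True when vec is pointwise below at least one tuple in allowed."""
    return any(all(leq(x, y) for x, y in zip(vec, t)) for t in allowed)


def satisfies_cc_bgsa(alpha, beta, gamma, delta):
    """alpha=(a1, a2), beta=(b2, b3), gamma=(g1, g3), delta=(d1, d2, d3)."""
    d1, d2, d3 = delta
    return (below_some(alpha, PAIRS)
            and below_some((d1, d2), PAIRS)
            and leq(beta[0], beta[1])
            and leq(gamma[0], gamma[1])
            and leq(d1, d3)
            and leq(d2, d3))


# Pairs (alpha_1, alpha_2) rejected by the first constraint.
violating_pairs = sorted(p for p in product(LABELS, repeat=2)
                         if not below_some(p, PAIRS))
```

Under this encoding, `violating_pairs` lists exactly the pairs that the proof of Lemma 2 treats as violations of the constraint on $(\alpha_1, \alpha_2)$.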

We aim to prove that $CC_{BGSA}$ is necessary and sufficient for a family of LISs
to have BGSA. On one hand, we claim that, if $\{L_1, L_2, L_3\}$ satisfies $CC_{BGSA}$,
then $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA. It is sufficient to prove the thesis for a set
of restricted BGSA constraints $CC^{*}_{BGSA}$, defined as follows:

$$(\alpha_1,\alpha_2),\ (\delta_1,\delta_2) \in \{(ab,ab), (b,a), (a,b)\}, \quad \beta_2 = \beta_3, \quad \gamma_1 = \gamma_3, \quad \delta_3 = \max\{\delta_1, \delta_2\}$$

Lemma 1. If $\{L_1, L_2, L_3\}$ satisfies $CC^{*}_{BGSA}$, then $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA.

The $CC^{*}_{BGSA}$ constraints can be relaxed to $CC_{BGSA}$ as shown in [17] (Theorem 2,
Lemma 3), due to the connection between the partial order on labelings and
LISs and the strength of the generated interpolants. For example, the constraint
$\delta_3 = \max\{\delta_1, \delta_2\}$ can be relaxed to $\delta_3 \succeq \delta_1$, $\delta_3 \succeq \delta_2$. This leads to:

Corollary 2. If $\{L_1, L_2, L_3\}$ satisfies $CC_{BGSA}$, then $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA.

On the other hand, it holds that the satisfaction of the $CC_{BGSA}$ constraints is
necessary for BGSA:

Lemma 2. If $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA, then $\{L_1, L_2, L_3\}$ satisfies $CC_{BGSA}$.

Having proved that $CC_{BGSA}$ is both sufficient and necessary, we conclude:

Theorem 9. A family $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA if and only if $\{L_1, L_2, L_3\}$
satisfies $CC_{BGSA}$.

n-GSA. After addressing the binary case, we move to defining necessary and
sufficient conditions for n-GSA. A family of LISs $\{Itp_{L_1}, \ldots, Itp_{L_{n+1}}\}$ has n-GSA
if, for any $\Phi = \{\phi_1, \ldots, \phi_{n+1}\}$, $I_{\phi_1,L_1} \wedge \ldots \wedge I_{\phi_n,L_n} \Rightarrow I_{\phi_1 \ldots \phi_n,L_{n+1}}$, provided
$\Phi$ is inconsistent. As we defined a set of labeling constraints for BGSA, we now
introduce n-GSA constraints ($CC_{nGSA}$) on a family of labelings $\{L_1, \ldots, L_{n+1}\}$;
for every variable with labeling vector $(\alpha_{i_1}, \ldots, \alpha_{i_{k+1}})$, $1 \leq k \leq n$, letting $m = k+1$
if $i_{k+1} \neq n+1$, $m = k$ otherwise:

(1) $(\exists j \in \{i_1, \ldots, i_m\}:\ \alpha_j = a) \Rightarrow (\forall h \in \{i_1, \ldots, i_m\},\ h \neq j:\ \alpha_h = b)$

(2) Moreover, if $i_{k+1} = n+1$: $\forall j \in \{i_1, \ldots, i_k\},\ \alpha_j \preceq \alpha_{i_{k+1}}$

That is, if a variable is not shared with $\phi_{n+1}$, then, if one of the labels is a,
all the others must be b; if the variable is shared with $\phi_{n+1}$, condition (1) still
holds for $(\alpha_{i_1}, \ldots, \alpha_{i_k})$, and all these labels must be stronger than or equal to
$\alpha_{i_{k+1}} = \alpha_{n+1}$. We can prove that these constraints are necessary and sufficient
for a family of LISs to have n-GSA:

Theorem 10. A family $\mathcal{F} = \{Itp_{L_1}, \ldots, Itp_{L_{n+1}}\}$ has n-GSA if and only if
$\{L_1, \ldots, L_{n+1}\}$ satisfies $CC_{nGSA}$.
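
Conditions (1) and (2) can be turned into a small checker for one variable. This is a sketch under assumptions: the labeling vector is encoded as a dictionary from partition index (1 to n+1) to a label string, and the function name is hypothetical.

```python
# Assumed encoding of the label order b <= ab <= a.
RANK = {"b": 0, "ab": 1, "a": 2}


def satisfies_cc_ngsa(vector, n):
    """Check conditions (1) and (2) for one variable.

    vector maps each partition index in which the variable occurs
    (1..n+1) to the label assigned to it in that configuration.
    """
    indices = sorted(vector)
    shared_with_last = indices[-1] == n + 1
    # Labels subject to condition (1): all but the one of configuration n+1.
    head = [vector[i] for i in indices if i != n + 1]
    # (1) if some label is a, every other label must be b.
    for pos, label in enumerate(head):
        others = head[:pos] + head[pos + 1:]
        if label == "a" and any(o != "b" for o in others):
            return False
    # (2) every other label must precede or equal the label of n+1.
    if shared_with_last:
        last = vector[n + 1]
        return all(RANK[label] <= RANK[last] for label in head)
    return True
```

For n = 2 the checker reproduces the BGSA constraints: for instance `{1: "a", 2: "a"}` is rejected by (1), and `{2: "a", 3: "b"}` is rejected by (2), which corresponds to $\beta_2 \preceq \beta_3$.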

In [17] (see Setting 1) it is proved that n-SA holds for any family of LISs stronger
than Pudlák. Theorem 10 is strictly more general, since it allows for tuples of
labels (e.g., $(\alpha_1,\alpha_2) = (a,b)$ or $(\delta_1,\delta_2,\delta_3) = (a,b,b)$) that were not considered
in [17]. The constraints for n-SA follow as a special case of $CC_{nGSA}$.

Corollary 3. A family $\mathcal{F} = \{Itp_{L_1}, \ldots, Itp_{L_n}\}$ has n-SA if and only if $\{L_1, \ldots, L_n\}$
satisfies the following constraints: for every variable with labeling vector $(\alpha_{i_1}, \ldots, \alpha_{i_k})$,
for $2 \leq k \leq n$: $(\exists j \in \{i_1, \ldots, i_k\}:\ \alpha_j = a) \Rightarrow (\forall h \in \{i_1, \ldots, i_k\},\ h \neq j:\ \alpha_h = b)$.
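
As a sanity check, the condition of Corollary 3 can be compared with the explicit label sets used elsewhere in the paper: the pairs $\{(ab,ab),(a,b),(b,a)\}$ of Corollary 4 and the triples listed for 3-SA in the proof of Theorem 15. The sketch below assumes the string encoding "a", "b", "ab" with $b \preceq ab \preceq a$; names are illustrative.

```python
from itertools import product

RANK = {"b": 0, "ab": 1, "a": 2}  # assumed encoding of b <= ab <= a
LABELS = ("a", "b", "ab")


def satisfies_nsa(labels):
    """Corollary 3: if one label is a, all the other labels are b."""
    for pos, label in enumerate(labels):
        others = labels[:pos] + labels[pos + 1:]
        if label == "a" and any(o != "b" for o in others):
            return False
    return True


def below_some(vec, allowed):
    """True when vec is pointwise below at least one tuple in allowed."""
    return any(all(RANK[x] <= RANK[y] for x, y in zip(vec, t))
               for t in allowed)


PAIRS = [("ab", "ab"), ("a", "b"), ("b", "a")]
TRIPLES = [("ab", "ab", "ab"), ("a", "b", "b"), ("b", "a", "b"), ("b", "b", "a")]

pairs_by_rule = {p for p in product(LABELS, repeat=2) if satisfies_nsa(p)}
pairs_by_set = {p for p in product(LABELS, repeat=2) if below_some(p, PAIRS)}
triples_by_rule = {t for t in product(LABELS, repeat=3) if satisfies_nsa(t)}
triples_by_set = {t for t in product(LABELS, repeat=3) if below_some(t, TRIPLES)}
```

Both formulations accept the same label vectors for k = 2 and k = 3, which is what the special-case reading of Corollary 3 suggests.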

Moreover, a family that has (n + 1)-SA also has n-GSA if the last member of the
family is Pudlák's system. In fact, from Proposition 2 and Pudlák's system being
symmetric (as shown in [5]), it follows that if a family $\{Itp_{L_1}, \ldots, Itp_{L_n}, Itp_P\}$
has (n + 1)-SA, then it has n-GSA.

After investigating n-GSA and n-SA, we address two questions which were 
left open in §21 do n-SA and n-PI imply n-STI? Is the requirement of additional 
interpolation systems necessary to obtain T-TI from n-GSA? We show here that 
n-SA and n-PI do not necessarily imply n-STI, and that, for LISs, n-GSA and 
T-TI are equivalent. 

n-STI. Theorem 3 shows that if a family has n-STI, then it has both n-SA and
n-PI. We prove that the converse is not necessarily true. First, it is not difficult
to show that any family $\{Itp_{L_0}, Itp_{L_1}, Itp_{L_2}\}$ has 2-PI (Proposition 3 in the
appendix); a second result is that:

Lemma 5. There exists a family $\{Itp_{L_0}, Itp_{L_1}, Itp_{L_2}\}$ that has 2-PI and a family
$\{Itp_{L'_1}, Itp_{L'_2}\}$ that has 2-SA, but the family $\{Itp_{L_0}, Itp_{L_1}, Itp_{L_2}, Itp_{L'_1}, Itp_{L'_2}\}$
does not have 2-STI.

We obtain the main result by applying the STI sub-family property (Theorem 2):

Theorem 11. There exists a family $\{Itp_{S_0}, \ldots, Itp_{S_n}\}$ that has n-PI, and a
family $\{Itp_{T_1}, \ldots, Itp_{T_n}\}$ that has n-SA, but the family $\{Itp_{S_0}, \ldots, Itp_{S_n}\} \cup
\{Itp_{T_1}, \ldots, Itp_{T_n}\}$ does not have n-STI.

T-TI. The last collective to be studied is T-TI. Theorem 6 shows how T-TI
can be obtained by multiple applications of GSA at the level of each parent
and its children, provided that we can find an appropriate labeling to generate
an interpolant for the parent. We prove here that, in the case of LISs, this
requirement is not needed, and derive explicit constraints on labelings for T-TI.

Let us call n-GSA strengthening any property derived from n-GSA by not
abstracting some of the subformulae $\phi_i$, for example
$I_{\phi_1,L_1} \wedge \ldots \wedge I_{\phi_{n-1},L_{n-1}} \wedge \phi_n \Rightarrow I_{\phi_1 \ldots \phi_n,L_{n+1}}$. It can be proved that:

Lemma 6. The set of labeling constraints of any n-GSA strengthening is a subset
of the constraints of n-GSA.

From Theorem 6 and Lemma 6 it follows that:

Lemma 7. Given a tree $T = (V, E)$, a family $\{Itp_{S_i}\}_{i \in V}$ has T-TI if for every
parent $i_{k+1}$ and its children $i_1, \ldots, i_k$, the family of labelings of the (k + 1)-GSA
strengthening obtained by not abstracting the parent satisfies the corresponding
subset of (k + 1)-GSA constraints.

Note that, in contrast to Theorem 6, in the case of LISs we do not need to
ensure the existence of an additional set of interpolation systems to abstract the
parents. The symmetry between the necessary and sufficient conditions given by
Theorem 5 and Theorem 6 is restored, and we establish:

Theorem 12. Given a tree $T = (V, E)$, a family $\{Itp_{S_i}\}_{i \in V}$ has T-TI if and
only if for every parent $i_{k+1}$ and its children $i_1, \ldots, i_k$, the family of labelings of
the (k + 1)-GSA strengthening obtained by not abstracting the parent satisfies
the corresponding subset of (k + 1)-GSA constraints.

Alternatively, in the case of LISs, the additional interpolation systems can be
constructed explicitly:

Theorem 13. Any $\mathcal{F} = \{Itp_{L_{i_1}}, \ldots, Itp_{L_{i_k}}, Itp_{L_{n+1}}\}$ s.t. $k < n$ that has an
n-GSA strengthening property can be extended to a family that has n-GSA.

Collectives of Single LISs. In the following, we highlight the fundamental re- 
sults in the context of single LISs, which represent the most common application 
of the framework of D'Silva et al. to model checking. 

First, surprisingly (and importantly for practical applications), any LIS sat- 
isfies PI: 

Theorem 14. PI holds for all single LISs. 



Second, recall that in §3 we proved that BGSA, STI, TI, GSA are equivalent
for single interpolation systems, and that SA $\Rightarrow$ BGSA for symmetric ones. We
now show that for a single LIS, SA is equivalent to BGSA and that PI is not.

Theorem 15. If a LIS has SA, then it has BGSA.

Proof. We show that, for any L, the labeling constraints of SA imply those of
BGSA. Refer to Table 1, Table 2, Theorem 10 and Corollary 3. In case of a
family $\{L_1, L_2, L_3\}$, the constraints for 3-SA are:

$$(\alpha_1,\alpha_2),\ (\beta_2,\beta_3),\ (\gamma_1,\gamma_3) \preceq \{(ab,ab), (b,a), (a,b)\}$$
$$(\delta_1,\delta_2,\delta_3) \preceq \{(ab,ab,ab), (a,b,b), (b,a,b), (b,b,a)\}$$

When $L_1 = L_2 = L_3$, they simplify to $\alpha, \beta, \gamma, \delta \in \{ab, b\}$; this means that, in
case of a single LIS, only Pudlák's or stronger systems are allowed. In case of a
family $\{L_1, L_2, L_3\}$, the constraints for BGSA are:

$$(\alpha_1,\alpha_2),\ (\delta_1,\delta_2) \preceq \{(ab,ab), (b,a), (a,b)\}, \quad \beta_2 \preceq \beta_3, \quad \gamma_1 \preceq \gamma_3, \quad \delta_1 \preceq \delta_3, \quad \delta_2 \preceq \delta_3$$

When $L_1 = L_2 = L_3$, they simplify to $\alpha, \delta \in \{ab, b\}$; clearly, the constraints for
3-SA imply those for BGSA, but not vice versa.

Finally, Theorem 14 and Theorem 15 yield:

Theorem 16. The system $Itp_{M'}$ has PI but does not have BGSA.

Proof. From the proof of Theorem 15, a LIS has the BGSA property iff it is
stronger than or equal to Pudlák's system. $Itp_{M'}$ is strictly weaker than $Itp_P$.
Thus, it does not have BGSA.
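
The single-LIS simplification used in the proofs of Theorem 15 and Theorem 16 can be replayed by enumeration. The sketch below assumes that a single LIS assigns the same label to a shared variable in every configuration, and uses the standard labels of the LIS framework of D'Silva et al. for the three named systems (b for McMillan's, ab for Pudlák's, a for the dual McMillan system); the dictionary and function names are illustrative.

```python
RANK = {"b": 0, "ab": 1, "a": 2}  # assumed encoding of b <= ab <= a
PAIRS = [("ab", "ab"), ("b", "a"), ("a", "b")]
TRIPLES = [("ab", "ab", "ab"), ("a", "b", "b"), ("b", "a", "b"), ("b", "b", "a")]


def below_some(vec, allowed):
    """True when vec is pointwise below at least one tuple in allowed."""
    return any(all(RANK[x] <= RANK[y] for x, y in zip(vec, t))
               for t in allowed)


def uniform_3sa(x):
    """3-SA constraints when every shared variable carries label x."""
    return below_some((x, x), PAIRS) and below_some((x, x, x), TRIPLES)


def uniform_bgsa(x):
    """BGSA constraints when every shared variable carries label x.

    The order constraints (beta_2 <= beta_3, ...) become x <= x and
    are therefore always satisfied.
    """
    return below_some((x, x), PAIRS)


# Label given to shared variables by each named system (LIS framework).
SYSTEMS = {"McMillan": "b", "Pudlak": "ab", "McMillan'": "a"}

sa_ok = {name for name, lab in SYSTEMS.items() if uniform_3sa(lab)}
bgsa_ok = {name for name, lab in SYSTEMS.items() if uniform_bgsa(lab)}
```

Under these assumptions both sets contain McMillan's and Pudlák's systems and exclude the dual McMillan system, in line with Theorem 16.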

Note that the necessary and sufficient conditions for LISs to support each 
of the collectives simplify implementing procedures with a given property, or, 
more importantly from a practical perspective, determine which implementation 
supports which property. 

5 Conclusions 

Craig interpolation is a widely used approach in abstraction-based model check- 
ing. Several interpolation systems have been developed and employed for bit-level 
verification, one of the most popular applications of software model checking and 
the main application in hardware model checking. This paper conducts a system- 
atic investigation of verification-related requirements for interpolants and as a 
result correlates the individual interpolation systems and compares the necessary 
verification-specific conditions. 

The paper makes the following contributions to the state-of-the-art of interpo- 
lation-based verification. It systematizes and unifies various properties imposed 
on interpolation by existing verification approaches and proves that for families 
of interpolation systems the properties form a hierarchy, whereas for a single sys- 
tem all properties except path-interpolation and simultaneous abstraction are in 
fact equivalent. Additionally, it defines and proves both sufficient and necessary 
conditions for a family of labeled interpolation systems. In particular, it demon- 
strates that in case of a single system path-interpolation is common to all LISs, 
while simultaneous abstraction is as strong as all other (more complex) properties.
Extending our framework to address interpolation in first order theories is
an interesting open problem, and will be part of our future work.

A Properties of Sub-families



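Theorem 2. A family $\{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-STI iff for all
$k \leq n$ the subfamily $\{Itp_{S_0}, \ldots, Itp_{S_k}\} \cup \{Itp_{T_1}, \ldots, Itp_{T_k}\}$ has k-STI.

Proof. ($\Rightarrow$) Assume an inconsistent $\Phi = \{\phi_1, \ldots, \phi_k\}$. We can extend it to a
$\Phi' = \{\phi'_1, \ldots, \phi'_n\}$ such that $\phi'_i = \phi_i$ for $i \leq k$, by adding $n - k$ empty formulae $\top$.
If the family has the n-STI property, for $0 \leq j \leq k - 1$:

$$I_{\phi'_1 \ldots \phi'_j,S_j} \wedge I_{\phi'_{j+1},T_{j+1}} \Rightarrow I_{\phi'_1 \ldots \phi'_{j+1},S_{j+1}}$$

($\Leftarrow$) Follows from $k = n$.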

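Theorem 17. A family $\mathcal{F} = \{Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has n-GSA iff for all $k \leq n$
all the subfamilies $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_{k+1}}}\}$ have k-GSA.

Proof. ($\Rightarrow$) Let n be a natural number. Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_{k+1}\}$
such that $k \leq n$. Let $\{i_1, \ldots, i_{k+1}\}$ be a subset of $\{1, \ldots, n+1\}$. Extend $\Phi$ to a
$\Phi' = \{\phi'_1, \ldots, \phi'_{n+1}\}$ by adding $(n - k)$ copies of $\top$, so that
$\phi'_{i_1} = \phi_1, \ldots, \phi'_{i_{k+1}} = \phi_{k+1}$. Since $\mathcal{F}$ has n-GSA:

$$\bigwedge_{i=1}^{n} I_{\phi'_i,S_i} \Rightarrow I_{\phi'_1 \ldots \phi'_n,S_{n+1}}$$

and, since $\phi'_j = \top$ for $j \notin \{i_1, \ldots, i_{k+1}\}$:

$$\bigwedge_{j \in \{i_1, \ldots, i_k\}} I_{\phi'_j,S_j} \Rightarrow I_{\phi'_{i_1} \ldots \phi'_{i_k},S_{i_{k+1}}}$$

($\Leftarrow$) Follows from $k = n$.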

It is easy to see that the technique used in the proof of Theorem 17, i.e.,
extending an unsatisfiable formula with $\top$ conjuncts, applies to the other
properties as well.

Theorem 18. A family $\{Itp_{S_1}, \ldots, Itp_{S_n}\}$ has n-SA iff for all $k \leq n$ all the
subfamilies $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}\}$ have k-SA.

Proof. The proof works as in Theorem 17.

Theorem 19. A family $\{Itp_{S_0}, \ldots, Itp_{S_n}\}$ has n-PI iff for all $k \leq n$ the
subfamily $\{Itp_{S_0}, \ldots, Itp_{S_k}\}$ has k-PI.

Proof. The proof works as in Theorem 2.

Theorem 20. For a given tree $T = (V, E)$, a family $\{Itp_{S_i}\}_{i \in V}$ has T-TI iff
for every subtree $T' = (V', E')$ of $T$, the family $\{Itp_{S_i}\}_{i \in V'}$ has T'-TI.

Proof. ($\Rightarrow$) Assume an inconsistent $\Phi = \{\phi_{i_1}, \ldots, \phi_{i_k}\}$ decorating $T'$. We can
extend $\Phi$ with $|V| - |V'|$ empty formulae $\top$ to $\Phi' = \{\phi'_1, \ldots, \phi'_n\}$ decorating $T$.
If $\{Itp_{S_i}\}_{i \in V}$ has the T-TI property, the T-TI condition holds for all $v_i$ in $V$
and in particular for all $v_i$ in $V'$, with the conjunction over the children ranging
over the edges $(v_i, v_j) \in E'$.

($\Leftarrow$) Follows from $T' = T$.

B Other Proofs 

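Proposition 1. SA implies BGSA in symmetric interpolation systems.

Proof. Take any inconsistent $\Phi = \{\phi_1, \phi_2, \phi_3\}$. If an interpolation system has
SA, then:

$$I_{\phi_1} \wedge I_{\phi_2} \wedge I_{\phi_3} \Rightarrow \bot$$

Equivalently,

$$I_{\phi_1} \wedge I_{\phi_2} \Rightarrow \overline{I_{\phi_3}}$$

For a symmetric system, $\overline{I_{\phi_3}} = I_{\phi_1\phi_2}$.

Proposition 2. If a family $\mathcal{F} = \{Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has (n + 1)-SA and $Itp_{S_{n+1}}$
is symmetric, then $\mathcal{F}$ has n-GSA.

Proof. Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_{n+1}\}$. Since $\mathcal{F}$ has (n + 1)-SA, then
$I_{\phi_1,S_1} \wedge \ldots \wedge I_{\phi_{n+1},S_{n+1}} \Rightarrow \bot$. Assuming $Itp_{S_{n+1}}$ is symmetric,
$\overline{I_{\phi_{n+1},S_{n+1}}} = I_{\phi_1 \ldots \phi_n,S_{n+1}}$ and the thesis is proved.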

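Theorem 3. If a family $\mathcal{F} = \{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-STI
then (1) $\{Itp_{S_0}, \ldots, Itp_{S_n}\}$ has n-PI and (2) $\{Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-SA.

Proof. (1) It follows from $\phi_i \Rightarrow I_{\phi_i,T_i}$ for every i.

(2) Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$. If $\mathcal{F}$ has n-STI, then, for
$0 \leq i \leq n - 1$:

$$I_{\phi_1 \ldots \phi_i,S_i} \wedge I_{\phi_{i+1},T_{i+1}} \Rightarrow I_{\phi_1 \ldots \phi_{i+1},S_{i+1}}$$

Since $I_{\phi_1 \ldots \phi_n,S_n} = \bot$, we get $I_{\phi_1,T_1} \wedge \ldots \wedge I_{\phi_n,T_n} \Rightarrow \bot$.

Theorem 4. A family $\mathcal{F} = \{Itp_{S_0}, \ldots, Itp_{S_n}, Itp_{T_1}, \ldots, Itp_{T_n}\}$ has n-STI iff
$\{Itp_{S_i}, Itp_{T_{i+1}}, Itp_{S_{i+1}}\}$ has BGSA for all $0 \leq i \leq n - 1$.

Proof. ($\Rightarrow$) Take any inconsistent $\Phi = \{\phi_1, \phi_2, \phi_3\}$. For $0 \leq i \leq n - 1$, extend $\Phi$
to a $\Phi' = \{\phi'_1, \ldots, \phi'_n\}$ by adding $(n - 3)$ copies of $\top$, so that $\phi'_i = \phi_1$, $\phi'_{i+1} = \phi_2$,
$\phi'_{i+2} = \phi_3$. Since $\mathcal{F}$ has n-STI:

$$I_{\phi'_1 \ldots \phi'_i,S_i} \wedge I_{\phi'_{i+1},T_{i+1}} \Rightarrow I_{\phi'_1 \ldots \phi'_{i+1},S_{i+1}}$$

Hence, by construction:

$$I_{\phi_1,S_i} \wedge I_{\phi_2,T_{i+1}} \Rightarrow I_{\phi_1\phi_2,S_{i+1}}$$

($\Leftarrow$) Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$. Since $\{Itp_{S_i}, Itp_{T_{i+1}}, Itp_{S_{i+1}}\}$ has
BGSA, it follows that for $\{\phi'_1, \phi'_2, \phi'_3\}$, where $\phi'_1 = \phi_1 \wedge \ldots \wedge \phi_i$, $\phi'_2 = \phi_{i+1}$,
$\phi'_3 = \phi_{i+2} \wedge \ldots \wedge \phi_n$:

$$I_{\phi'_1,S_i} \wedge I_{\phi'_2,T_{i+1}} \Rightarrow I_{\phi'_1\phi'_2,S_{i+1}}$$

Hence, by construction:

$$I_{\phi_1 \ldots \phi_i,S_i} \wedge I_{\phi_{i+1},T_{i+1}} \Rightarrow I_{\phi_1 \ldots \phi_{i+1},S_{i+1}}$$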

Theorem 5. Given a tree $T = (V, E)$, if a family $\mathcal{F} = \{Itp_{S_i}\}_{i \in V}$ has T-TI,
then, for every parent $i_{k+1}$ and its children $i_1, \ldots, i_k$:

1. If $i_{k+1}$ is the root, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}\}$ has k-SA.

2. Otherwise, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{S_{i_{k+1}}}\}$ has k-GSA.

Proof. Take any inconsistent $\Phi = \{\phi_{i_1}, \ldots, \phi_{i_{k+1}}\}$. Consider a parent $i_{k+1}$ and
its children $i_1, \ldots, i_k$. If $i_{k+1}$ is not the root, extend $\Phi$ to a $\Phi'$ in such a way
that: the children are decorated with $\phi_{i_1}, \ldots, \phi_{i_k}$, all their descendants and $i_{k+1}$
with $\top$, all the nodes external to the subtree rooted in $i_{k+1}$ with $\phi_{i_{k+1}}$. Since $\mathcal{F}$
has T-TI, then at node $i_{k+1}$:

$$\bigwedge_{(i_{k+1},j) \in E} I_{F_j,S_j} \wedge \phi'_{i_{k+1}} \Rightarrow I_{F_{i_{k+1}},S_{i_{k+1}}}$$

that is:

$$\bigwedge_{i \in \{i_1, \ldots, i_k\}} I_{\phi_i,S_i} \wedge \top \Rightarrow I_{\phi_{i_1} \ldots \phi_{i_k},S_{i_{k+1}}}$$

If $i_{k+1}$ is the root, the proof simply ignores the presence of $\phi_{i_{k+1}}$ and $S_{i_{k+1}}$.

Theorem 6. Given a tree $T = (V, E)$, a family $\mathcal{F} = \{Itp_{S_i}\}_{i \in V}$ has T-TI if,
for every node $i_{k+1}$ and its children $i_1, \ldots, i_k$, there exists $T_{i_{k+1}}$ such that:

1. If $i_{k+1}$ is the root, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{T_{i_{k+1}}}\}$ has (k + 1)-SA.

2. Otherwise, $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{T_{i_{k+1}}}, Itp_{S_{i_{k+1}}}\}$ has (k + 1)-GSA.

Proof. Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$. Consider a parent $i_{k+1}$ different
from the root and its children $i_1, \ldots, i_k$.

If $\{Itp_{S_{i_1}}, \ldots, Itp_{S_{i_k}}, Itp_{T_{i_{k+1}}}, Itp_{S_{i_{k+1}}}\}$ has (k + 1)-GSA, then for
$\{F_{i_1}, \ldots, F_{i_k}, \phi_{i_{k+1}}, \Phi \setminus (\bigcup_j F_{i_j} \cup \{\phi_{i_{k+1}}\})\}$:

$$\bigwedge_{i \in \{i_1, \ldots, i_k\}} I_{F_i,S_i} \wedge I_{\phi_{i_{k+1}},T_{i_{k+1}}} \Rightarrow I_{F_{i_{k+1}},S_{i_{k+1}}}$$

The thesis follows since $\phi_{i_{k+1}} \Rightarrow I_{\phi_{i_{k+1}},T_{i_{k+1}}}$. If $i_{k+1}$ is the root, $I_{F_{i_{k+1}},S_{i_{k+1}}} = \bot$
and $S_{i_{k+1}}$ is superfluous.



Figure 4: $T^n_{GSA}$. Figure 5: $T^n_{STI}$.

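Theorem 7. If a family $\mathcal{F} = \{Itp_{S_{n+1}}, Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has $T^n_{GSA}$-TI, then
$\{Itp_{S_1}, \ldots, Itp_{S_{n+1}}\}$ has n-GSA.

Proof. Let $T^n_{GSA} = (V, E)$ be the tree shown in Fig. 4, where $V = \{0, \ldots, n+1\}$
and $E = \{(0, i) \mid 1 \leq i \leq n\} \cup \{(n+1, 0)\}$.

Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_{n+1}\}$. We decorate node 0 with $\top$, all
other nodes i with $\phi_i$, for $1 \leq i \leq n+1$. Since $\mathcal{F}$ has T-TI, the T-TI condition
holds at node 0. Hence, by construction:

$$\bigwedge_{i=1}^{n} I_{\phi_i,S_i} \Rightarrow I_{\phi_1 \ldots \phi_n,S_{n+1}}$$

Theorem 8. If a family $\mathcal{F} = \{Itp_{S_0}, \ldots, Itp_{S_n}\} \cup \{Itp_{T_1}, \ldots, Itp_{T_n}\}$ has
$T^n_{STI}$-TI, then it has n-STI.

Proof. Let $T^n_{STI} = (V, E)$ be the tree shown in Fig. 5, where $V = \{1, \ldots, 2n\}$
and $E = \{(n+i, i) \mid 1 \leq i \leq n\} \cup \{(n+i, n+i-1) \mid 1 < i \leq n\}$.

Take any inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$. For $1 \leq i \leq n$, we decorate i with
$\phi_i$, n + i with $\top$; similarly we associate i with $Itp_{T_i}$ and n + i with $Itp_{S_i}$. Since
$\mathcal{F}$ has T-TI, the T-TI condition holds at every node n + i + 1, for $0 \leq i \leq n - 1$.
Hence, by construction, the n-STI condition holds:

$$I_{\phi_1 \ldots \phi_i,S_i} \wedge I_{\phi_{i+1},T_{i+1}} \Rightarrow I_{\phi_1 \ldots \phi_{i+1},S_{i+1}}$$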

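Lemma 1. If $\{L_1, L_2, L_3\}$ satisfies $CC^{*}_{BGSA}$, then $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA.

Proof (by structural induction). We recall here the restricted BGSA constraints
$CC^{*}_{BGSA}$:

$$(\alpha_1,\alpha_2),\ (\delta_1,\delta_2) \in \{(ab,ab), (b,a), (a,b)\}, \quad \beta_2 = \beta_3, \quad \gamma_1 = \gamma_3, \quad \delta_3 = \max\{\delta_1, \delta_2\}$$

The reader can verify that the conditions on the $\delta_i$ are equivalent to
$(\delta_1, \delta_2, \delta_3) \in \{(ab,ab,ab), (b,a,a), (a,b,a)\}$.

We show that, given a refutation of $\Phi$, for any clause C in the refutation
the partial interpolants satisfy $I_{\phi_1,L_1}(C) \wedge I_{\phi_2,L_2}(C) \Rightarrow I_{\phi_1\phi_2,L_3}(C)$, that is

$$I_{\phi_1,L_1}(C) \wedge I_{\phi_2,L_2}(C) \wedge \overline{I_{\phi_1\phi_2,L_3}(C)} \Rightarrow \bot$$

For simplicity, we write $I_1$, $I_2$, $I_3$ to refer to the three partial interpolants for
C and, if C has antecedents, we denote their partial interpolants with $I_1^+$, $I_2^+$,
$I_3^+$ and $I_1^-$, $I_2^-$, $I_3^-$.

Base case (leaf). Case splitting on C (refer to Table 2):

$$C \in \phi_1: \quad I_1 = C\downharpoonright_{1,b}, \quad I_2 = \overline{C\downharpoonright_{2,a}}, \quad I_3 = C\downharpoonright_{3,b}$$
$$C \in \phi_2: \quad I_1 = \overline{C\downharpoonright_{1,a}}, \quad I_2 = C\downharpoonright_{2,b}, \quad I_3 = C\downharpoonright_{3,b}$$
$$C \in \phi_3: \quad I_1 = \overline{C\downharpoonright_{1,a}}, \quad I_2 = \overline{C\downharpoonright_{2,a}}, \quad I_3 = \overline{C\downharpoonright_{3,a}}$$

The goal is to show that in each case $I_1 \wedge I_2 \wedge \overline{I_3} \Rightarrow \bot$. Representing C by
grouping variables into the different partitions, with overbraces to show the label
assigned to each variable, we have:

$C \in \phi_1$:

$$C\downharpoonright_{1,b} = \overbrace{C_{\phi_1}\downharpoonright_b}^{a} \vee \overbrace{C_{\phi_1\phi_2}\downharpoonright_b}^{\alpha_1} \vee \overbrace{C_{\phi_1\phi_3}\downharpoonright_b}^{\gamma_1} \vee \overbrace{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}^{\delta_1}$$
$$\overline{C\downharpoonright_{2,a}} = \overbrace{\overline{C_{\phi_1}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_1\phi_2}\downharpoonright_a}}^{\alpha_2} \wedge \overbrace{\overline{C_{\phi_1\phi_3}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_2}$$
$$\overline{C\downharpoonright_{3,b}} = \overbrace{\overline{C_{\phi_1}\downharpoonright_b}}^{a} \wedge \overbrace{\overline{C_{\phi_1\phi_2}\downharpoonright_b}}^{a} \wedge \overbrace{\overline{C_{\phi_1\phi_3}\downharpoonright_b}}^{\gamma_3} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}}^{\delta_3}$$

$C \in \phi_2$:

$$\overline{C\downharpoonright_{1,a}} = \overbrace{\overline{C_{\phi_2}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_1\phi_2}\downharpoonright_a}}^{\alpha_1} \wedge \overbrace{\overline{C_{\phi_2\phi_3}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_1}$$
$$C\downharpoonright_{2,b} = \overbrace{C_{\phi_2}\downharpoonright_b}^{a} \vee \overbrace{C_{\phi_1\phi_2}\downharpoonright_b}^{\alpha_2} \vee \overbrace{C_{\phi_2\phi_3}\downharpoonright_b}^{\beta_2} \vee \overbrace{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}^{\delta_2}$$
$$\overline{C\downharpoonright_{3,b}} = \overbrace{\overline{C_{\phi_2}\downharpoonright_b}}^{a} \wedge \overbrace{\overline{C_{\phi_1\phi_2}\downharpoonright_b}}^{a} \wedge \overbrace{\overline{C_{\phi_2\phi_3}\downharpoonright_b}}^{\beta_3} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}}^{\delta_3}$$

$C \in \phi_3$:

$$\overline{C\downharpoonright_{1,a}} = \overbrace{\overline{C_{\phi_3}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_2\phi_3}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_1\phi_3}\downharpoonright_a}}^{\gamma_1} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_1}$$
$$\overline{C\downharpoonright_{2,a}} = \overbrace{\overline{C_{\phi_3}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_2\phi_3}\downharpoonright_a}}^{\beta_2} \wedge \overbrace{\overline{C_{\phi_1\phi_3}\downharpoonright_a}}^{b} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_2}$$
$$C\downharpoonright_{3,a} = \overbrace{C_{\phi_3}\downharpoonright_a}^{b} \vee \overbrace{C_{\phi_2\phi_3}\downharpoonright_a}^{\beta_3} \vee \overbrace{C_{\phi_1\phi_3}\downharpoonright_a}^{\gamma_3} \vee \overbrace{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}^{\delta_3}$$

We can carry out some simplifications, due to the equality constraints in $CC^{*}_{BGSA}$
and the fact that variables with label a restricted w.r.t. b (and vice versa) are
removed, leading (with the help of the resolution rule) to the constraints:

$$\Big(\overbrace{C_{\phi_1\phi_2}\downharpoonright_b}^{\alpha_1} \vee \overbrace{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}^{\delta_1}\Big) \wedge \overbrace{\overline{C_{\phi_1\phi_2}\downharpoonright_a}}^{\alpha_2} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_2} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}}^{\delta_3} \Rightarrow \bot$$
$$\overbrace{\overline{C_{\phi_1\phi_2}\downharpoonright_a}}^{\alpha_1} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_1} \wedge \Big(\overbrace{C_{\phi_1\phi_2}\downharpoonright_b}^{\alpha_2} \vee \overbrace{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}^{\delta_2}\Big) \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_b}}^{\delta_3} \Rightarrow \bot$$
$$\overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_1} \wedge \overbrace{\overline{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}}^{\delta_2} \wedge \overbrace{C_{\phi_1\phi_2\phi_3}\downharpoonright_a}^{\delta_3} \Rightarrow \bot$$

Finally, the constraints on $(\alpha_1, \alpha_2)$ and $(\delta_1, \delta_2, \delta_3)$ guarantee that the remaining
variables are simplified away, proving the base case.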

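Inductive step (inner node). The inductive hypothesis (i.h.) consists of
$I_1^+ \wedge I_2^+ \wedge \overline{I_3^+} \Rightarrow \bot$, $I_1^- \wedge I_2^- \wedge \overline{I_3^-} \Rightarrow \bot$. We do a case splitting on the pivot p:

Case 1 (p in $\phi_1$).

$$I_1 \wedge I_2 \wedge \overline{I_3} \Leftrightarrow$$
$$(I_1^+ \vee I_1^-) \wedge (I_2^+ \wedge I_2^-) \wedge \overline{(I_3^+ \vee I_3^-)} \Leftrightarrow$$
$$(I_1^+ \vee I_1^-) \wedge I_2^+ \wedge I_2^- \wedge \overline{I_3^+} \wedge \overline{I_3^-} \Rightarrow$$
$$(I_1^+ \wedge I_2^+ \wedge \overline{I_3^+}) \vee (I_1^- \wedge I_2^- \wedge \overline{I_3^-}) \Rightarrow^{i.h.} \bot$$

Case 2 (p in $\phi_3$).

$$I_1 \wedge I_2 \wedge \overline{I_3} \Leftrightarrow$$
$$(I_1^+ \wedge I_1^-) \wedge (I_2^+ \wedge I_2^-) \wedge \overline{(I_3^+ \wedge I_3^-)} \Leftrightarrow$$
$$I_1^+ \wedge I_1^- \wedge I_2^+ \wedge I_2^- \wedge (\overline{I_3^+} \vee \overline{I_3^-}) \Rightarrow$$
$$(I_1^+ \wedge I_2^+ \wedge \overline{I_3^+}) \vee (I_1^- \wedge I_2^- \wedge \overline{I_3^-}) \Rightarrow^{i.h.} \bot$$

Case 3 (p in $\phi_1\phi_2\phi_3$). If $(\delta_1, \delta_2, \delta_3) = (ab, ab, ab)$:

$$I_1 \wedge I_2 \wedge \overline{I_3} \Leftrightarrow$$
$$(I_1^+ \vee p) \wedge (I_1^- \vee \overline{p}) \wedge (I_2^+ \vee p) \wedge (I_2^- \vee \overline{p}) \wedge \overline{((I_3^+ \vee p) \wedge (I_3^- \vee \overline{p}))} \Leftrightarrow$$
$$(I_1^+ \vee p) \wedge (I_1^- \vee \overline{p}) \wedge (I_2^+ \vee p) \wedge (I_2^- \vee \overline{p}) \wedge ((\overline{I_3^+} \wedge \overline{p}) \vee (\overline{I_3^-} \wedge p)) \Rightarrow$$
$$((I_1^+ \vee p) \wedge (I_2^+ \vee p) \wedge \overline{I_3^+} \wedge \overline{p}) \vee ((I_1^- \vee \overline{p}) \wedge (I_2^- \vee \overline{p}) \wedge \overline{I_3^-} \wedge p) \Rightarrow^{resol}$$
$$(I_1^+ \wedge I_2^+ \wedge \overline{I_3^+}) \vee (I_1^- \wedge I_2^- \wedge \overline{I_3^-}) \Rightarrow^{i.h.} \bot$$

All the remaining cases are treated in a similar manner, to reach a point (possibly
after a resolution step if some of the labels are ab) where the inductive hypothesis
can be applied.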
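
The three derivations of the inductive step are purely propositional, so they can be confirmed by truth-table enumeration. The sketch below treats the six antecedent interpolants and the pivot as free Boolean variables, which is an assumption made only for this check; the function names are illustrative. The last function shows, for contrast, a pivot labeled (a, a, a), which violates the constraints.

```python
from itertools import product


def target(i1p, i1m, i2p, i2m, i3p, i3m, p):
    """Disjunction to which the inductive hypothesis applies."""
    return (i1p and i2p and not i3p) or (i1m and i2m and not i3m)


def case1(i1p, i1m, i2p, i2m, i3p, i3m, p):
    """Pivot in phi_1: labels (a, b, a)."""
    return (i1p or i1m) and (i2p and i2m) and not (i3p or i3m)


def case2(i1p, i1m, i2p, i2m, i3p, i3m, p):
    """Pivot in phi_3: labels (b, b, b)."""
    return (i1p and i1m) and (i2p and i2m) and not (i3p and i3m)


def case3(i1p, i1m, i2p, i2m, i3p, i3m, p):
    """Pivot in phi_1 phi_2 phi_3 with labels (ab, ab, ab)."""
    i1 = (i1p or p) and (i1m or not p)
    i2 = (i2p or p) and (i2m or not p)
    i3 = (i3p or p) and (i3m or not p)
    return i1 and i2 and not i3


def violating(i1p, i1m, i2p, i2m, i3p, i3m, p):
    """Shared pivot labeled (a, a, a): not allowed by the constraints."""
    return (i1p or i1m) and (i2p or i2m) and not (i3p or i3m)


def implies_target(case):
    """True when I_1 and I_2 and not I_3 entails the target formula."""
    return all(target(*v)
               for v in product([False, True], repeat=7) if case(*v))
```

Calling `implies_target` on `case1`, `case2` and `case3` confirms the three implications, while `implies_target(violating)` fails, which illustrates why the constraint on $(\delta_1, \delta_2)$ is needed for the induction to go through.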

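Lemma 2. If $\{Itp_{L_1}, Itp_{L_2}, Itp_{L_3}\}$ has BGSA, then $\{L_1, L_2, L_3\}$ satisfies $CC_{BGSA}$.

Proof (by contradiction). We recall here the BGSA constraints $CC_{BGSA}$:

$$(\alpha_1,\alpha_2),\ (\delta_1,\delta_2) \preceq \{(ab,ab), (b,a), (a,b)\}, \quad \beta_2 \preceq \beta_3, \quad \gamma_1 \preceq \gamma_3, \quad \delta_1 \preceq \delta_3, \quad \delta_2 \preceq \delta_3$$

We show that, if any of the $CC_{BGSA}$ constraints is violated, there exist an
unsatisfiable formula $\Phi = \{\phi_1, \phi_2, \phi_3\}$ and a refutation such that
$I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \not\Rightarrow I_{\phi_1\phi_2,L_3}$.
The possible violations for the $CC_{BGSA}$ constraints consist of:

1. $(\alpha_1, \alpha_2), (\delta_1, \delta_2) \in \{(a,a), (ab,a), (a,ab)\}$

2. $(\beta_2, \beta_3), (\gamma_1, \gamma_3), (\delta_1, \delta_3), (\delta_2, \delta_3) \in \{(a,ab), (a,b), (ab,b)\}$

It is sufficient to take into account $(\alpha_1, \alpha_2) \in \{(a,a), (a,ab)\}$ and $(\beta_2, \beta_3) \in
\{(a,ab), (a,b), (ab,b)\}$. The remaining cases follow by symmetry. In the refutations
below, each clause is followed by its partial interpolant in square brackets.

(1) $(\alpha_1, \alpha_2) = (a, a)$: $\phi_1 = (p \vee \overline{q}) \wedge r$, $\phi_2 = (\overline{p} \vee \overline{r}) \wedge q$, $\phi_3 = s$

    $A = \phi_1$, $B = \phi_2, \phi_3$:
    $p \vee \overline{q}\ [\bot]$          $\overline{p} \vee \overline{r}\ [p \wedge r]$
    $\overline{q} \vee \overline{r}\ [p \wedge r]$          $r\ [\bot]$
    $\overline{q}\ [p \wedge r]$          $q\ [\overline{q}]$
    $\bot\ [(p \wedge r) \vee \overline{q}]$

    $A = \phi_2$, $B = \phi_1, \phi_3$:
    $p \vee \overline{q}\ [\overline{p} \wedge q]$          $\overline{p} \vee \overline{r}\ [\bot]$
    $\overline{q} \vee \overline{r}\ [\overline{p} \wedge q]$          $r\ [\overline{r}]$
    $\overline{q}\ [(\overline{p} \wedge q) \vee \overline{r}]$          $q\ [\bot]$
    $\bot\ [(\overline{p} \wedge q) \vee \overline{r}]$

We have $I_{\phi_1,L_1} = (p \wedge r) \vee \overline{q}$, $I_{\phi_2,L_2} = (\overline{p} \wedge q) \vee \overline{r}$, $I_{\phi_1\phi_2,L_3} = \bot$ since s is
absent from the proof. Then, $I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \not\Rightarrow I_{\phi_1\phi_2,L_3}$: a counter model
is $\overline{q}, \overline{r}$.

(2) $(\alpha_1, \alpha_2) = (a, ab)$: $\phi_1 = (p \vee \overline{q}) \wedge r$, $\phi_2 = (\overline{p} \vee \overline{r}) \wedge q$, $\phi_3 = s$

    $A = \phi_2$, $B = \phi_1, \phi_3$:
    $p \vee \overline{q}\ [\top]$          $\overline{p} \vee \overline{r}\ [\bot]$
    $\overline{q} \vee \overline{r}\ [\overline{p}]$          $q\ [\bot]$
    $\overline{r}\ [(\overline{p} \vee \overline{q}) \wedge q]$          $r\ [\top]$
    $\bot\ [((\overline{p} \vee \overline{q}) \wedge q) \vee \overline{r}]$

We have $I_{\phi_1,L_1} = (p \wedge r) \vee \overline{q}$ and $I_{\phi_1\phi_2,L_3} = \bot$ as in (1), while $I_{\phi_2,L_2} =
((\overline{p} \vee \overline{q}) \wedge q) \vee \overline{r}$. Then, $I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \not\Rightarrow I_{\phi_1\phi_2,L_3}$: a counter model is $\overline{q}, \overline{r}$.

(3) $(\beta_2, \beta_3) = (a, b)$: $\phi_1 = s$, $\phi_2 = (\overline{p} \vee \overline{r}) \wedge q$, $\phi_3 = (p \vee \overline{q}) \wedge r$

    $A = \phi_1, \phi_2$, $B = \phi_3$:
    $p \vee \overline{q}\ [\top]$          $\overline{p} \vee \overline{r}\ [\overline{p} \vee \overline{r}]$
    $\overline{q} \vee \overline{r}\ [\overline{p} \vee \overline{r}]$          $r\ [\top]$
    $\overline{q}\ [\overline{p} \vee \overline{r}]$          $q\ [q]$
    $\bot\ [(\overline{p} \vee \overline{r}) \wedge q]$

We have $I_{\phi_1,L_1} = \top$, since s is absent from the proof, while $I_{\phi_2,L_2} = (\overline{p} \wedge q) \vee \overline{r}$
as in (1); $I_{\phi_1\phi_2,L_3} = (\overline{p} \vee \overline{r}) \wedge q$. Then, $I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \not\Rightarrow I_{\phi_1\phi_2,L_3}$: a counter
model is $\overline{q}, \overline{r}$.

(4) $(\beta_2, \beta_3) = (a, ab)$: $\phi_1 = s$, $\phi_2 = (\overline{p} \vee \overline{r}) \wedge q$, $\phi_3 = (p \vee \overline{q}) \wedge r$

    $A = \phi_1, \phi_2$, $B = \phi_3$:
    $p \vee \overline{q}\ [\top]$          $\overline{p} \vee \overline{r}\ [\bot]$
    $\overline{q} \vee \overline{r}\ [\overline{p}]$          $r\ [\top]$
    $\overline{q}\ [\overline{p} \vee \overline{r}]$          $q\ [\bot]$
    $\bot\ [(\overline{p} \vee \overline{r} \vee \overline{q}) \wedge q]$

$I_{\phi_1,L_1} = \top$ as in (3), $I_{\phi_2,L_2} = (\overline{p} \wedge q) \vee \overline{r}$ as in (1), $I_{\phi_1\phi_2,L_3} = (\overline{p} \vee \overline{r} \vee \overline{q}) \wedge q$.
Then, $I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \not\Rightarrow I_{\phi_1\phi_2,L_3}$: a counter model is $\overline{q}, \overline{r}$.

(5) $(\beta_2, \beta_3) = (ab, b)$: $\phi_1 = s$, $\phi_2 = (\overline{p} \vee \overline{r}) \wedge q$, $\phi_3 = (p \vee \overline{q}) \wedge r$

$I_{\phi_1,L_1} = \top$ as in (3), $I_{\phi_2,L_2} = ((\overline{p} \vee \overline{q}) \wedge q) \vee \overline{r}$ as in (2), $I_{\phi_1\phi_2,L_3} = (\overline{p} \vee \overline{r}) \wedge q$
as in (3). Then, $I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \not\Rightarrow I_{\phi_1\phi_2,L_3}$: a counter model is $\overline{q}, \overline{r}$.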
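
Case (1) of the proof can be replayed with a few lines of code. The sketch below implements the labeled interpolation rules as they are used in the refutations above (a leaf of A yields the literals labeled b, a leaf of B yields the negation of the literals labeled a, and a pivot labeled a, b or ab combines the antecedents by disjunction, conjunction, or the Pudlák-style selector). The clause encoding, the fixed resolution order and all names are assumptions made for illustration; other refutations of the same formula may give different interpolants.

```python
from itertools import product


def lit_true(lit, model):
    var, positive = lit
    return model[var] == positive


def leaf(clause, in_a, label):
    """Partial interpolant of a leaf clause, as a function of a model."""
    if in_a:  # keep the literals labeled b
        kept = [l for l in clause if label[l[0]] == "b"]
        return lambda m: any(lit_true(l, m) for l in kept)
    kept = [l for l in clause if label[l[0]] == "a"]  # negate those labeled a
    return lambda m: not any(lit_true(l, m) for l in kept)


def resolve(i_pos, i_neg, pivot, label):
    """i_pos / i_neg belong to the antecedent with the pivot positive / negated."""
    lab = label[pivot]
    if lab == "a":
        return lambda m: i_pos(m) or i_neg(m)
    if lab == "b":
        return lambda m: i_pos(m) and i_neg(m)
    return lambda m: (i_pos(m) or m[pivot]) and (i_neg(m) or not m[pivot])


# Clauses of case (1), tagged with the partition they come from.
C1 = ([("p", True), ("q", False)], 1)   # p or not q, from phi_1
C2 = ([("p", False), ("r", False)], 2)  # not p or not r, from phi_2
C3 = ([("r", True)], 1)                 # r, from phi_1
C4 = ([("q", True)], 2)                 # q, from phi_2


def interpolant(a_parts, label):
    """Interpolant of the fixed refutation: resolve on p, then r, then q."""
    lf = lambda c: leaf(c[0], c[1] in a_parts, label)
    step1 = resolve(lf(C1), lf(C2), "p", label)  # not q or not r
    step2 = resolve(lf(C3), step1, "r", label)   # not q
    return resolve(lf(C4), step2, "q", label)    # empty clause


def counter_models(l1, l2):
    """Models of I_1 and I_2 and not I_3, with p, q, r labeled l1 / l2."""
    uniform = lambda x: {v: x for v in "pqr"}
    i1 = interpolant({1}, uniform(l1))
    i2 = interpolant({2}, uniform(l2))
    i3 = interpolant({1, 2}, uniform("a"))  # p, q, r are A-local here
    models = (dict(zip("pqr", vals)) for vals in product([False, True], repeat=3))
    return [m for m in models if i1(m) and i2(m) and not i3(m)]
```

With labels (a, a) the function returns the models in which q and r are both false, matching the counter model of case (1); with labels (b, a), which satisfy the constraint, this particular refutation yields no counter model.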

Lemma 3. If $\{L_1, \ldots, L_{n+1}\}$ satisfies $CC_{nGSA}$, then the family $\{Itp_{L_1}, \ldots, Itp_{L_{n+1}}\}$
has n-GSA.

Proof (by structural induction). We assume that the $CC_{nGSA}$ constraints have
been restricted in a similar manner to what is shown for $CC^{*}_{BGSA}$. We prove that,
given a refutation of $\Phi$, for any clause C in the refutation the partial interpolants
satisfy $I_{\phi_1,L_1}(C) \wedge \ldots \wedge I_{\phi_n,L_n}(C) \Rightarrow I_{\phi_1 \ldots \phi_n,L_{n+1}}(C)$, that is
$I_{\phi_1,L_1}(C) \wedge \ldots \wedge I_{\phi_n,L_n}(C) \wedge \overline{I_{\phi_1 \ldots \phi_n,L_{n+1}}(C)} \Rightarrow \bot$.

Base case (leaf). Remember that, if $C \in \phi_i$, $i \neq n+1$, C has class A in
configuration i (hence the partial interpolant is $C\downharpoonright_{i,b}$) and in configuration n + 1
($C\downharpoonright_{n+1,b}$) and class B in all the other configurations $j \neq i, n+1$ ($\overline{C\downharpoonright_{j,a}}$). If
$C \in \phi_{n+1}$, it has class B in all configurations ($\overline{C\downharpoonright_{n+1,a}}$ in configuration n + 1,
$\overline{C\downharpoonright_{i,a}}$ everywhere else). So we need to prove:

$$\overline{C\downharpoonright_{1,a}} \wedge \ldots \wedge \overline{C\downharpoonright_{i-1,a}} \wedge C\downharpoonright_{i,b} \wedge \overline{C\downharpoonright_{i+1,a}} \wedge \ldots \wedge \overline{C\downharpoonright_{n,a}} \wedge \overline{C\downharpoonright_{n+1,b}} \Rightarrow \bot$$
$$\overline{C\downharpoonright_{1,a}} \wedge \ldots \wedge \overline{C\downharpoonright_{n,a}} \wedge C\downharpoonright_{n+1,a} \Rightarrow \bot$$

respectively for $i \neq n+1$ and $i = n+1$.

We can divide the variables of $C \in \phi_i$ into partitions, obtaining $C = C_{\phi_i} \vee
C_{\phi_i\phi_2} \vee \ldots \vee C_{\phi_1 \ldots \phi_n}$, leading to a system of constraints as shown for BGSA; the
conjunction of:

$$\overline{(C_{\phi_i} \vee C_{\phi_i\phi_2} \vee \ldots \vee C_{\phi_1 \ldots \phi_n})\downharpoonright_{1,a}}$$
$$\vdots$$
$$(C_{\phi_i} \vee C_{\phi_i\phi_2} \vee \ldots \vee C_{\phi_1 \ldots \phi_n})\downharpoonright_{i,b}$$
$$\vdots$$
$$\overline{(C_{\phi_i} \vee C_{\phi_i\phi_2} \vee \ldots \vee C_{\phi_1 \ldots \phi_n})\downharpoonright_{n,a}}$$

must imply $\bot$ for every $\phi_i$, $i \neq n+1$ (similarly for $\phi_{n+1}$). All the simplifications
are carried out in line with the proof of Lemma 1.

Inductive step (inner node). The proof is again a direct generalization of the
proof of Lemma 1.

Performing a case splitting on the pivot and on its labeling vector, the starting
point is a conjunction of the partial interpolants $I_1 \wedge \ldots \wedge I_n \wedge \overline{I_{n+1}}$ of C, which
is then expressed in terms of the partial interpolants for the antecedents. The
goal is to reach a formula $\psi = (I_1^+ \wedge \ldots \wedge I_n^+ \wedge \overline{I_{n+1}^+}) \vee (I_1^- \wedge \ldots \wedge I_n^- \wedge \overline{I_{n+1}^-})$
where the inductive hypothesis can be applied.

The key observation is that the restricted $CC_{nGSA}$ constraints give rise to a
combination of boolean operators (after the dualization of the ones in $I_{n+1}$ due
to the negation) which makes it always possible to obtain the desired $\psi$, possibly
with the help of the resolution rule.

Lemma 4. If a family $\mathcal{F} = \{Itp_{L_1}, \ldots, Itp_{L_{n+1}}\}$ has n-GSA, then $\{L_1, \ldots, L_{n+1}\}$
satisfies $CC_{nGSA}$.

Proof (by induction and contradiction). We prove the theorem by strong induction
on $n \geq 2$.

Base Case (n = 2). Follows by Lemma 2.

Inductive Step. Assume the thesis holds for all $k \leq n - 1$, we prove it for
$k = n$. By Theorem 17, if a family $\mathcal{F} = \{Itp_{L_1}, \ldots, Itp_{L_{n+1}}\}$ has n-GSA, then any
subfamily of size $k + 1 \leq n$ has k-GSA. Combined with the inductive hypothesis,
this implies that it is sufficient to establish the theorem for every variable p
and labeling vectors $\alpha = (\alpha_1, \ldots, \alpha_n)$ and $\beta = (\beta_1, \ldots, \beta_{n+1})$ corresponding to
partitions $\phi_1 \ldots \phi_n$ and $\phi_1 \ldots \phi_{n+1}$, respectively.

We only show the case of $\alpha$. The proof for $\beta$ is analogous. W.l.o.g., assume
that there is a p such that $\alpha$ violates $CC_{nGSA}$ for $\alpha_1 = \alpha_2 = a$ (other cases are
symmetric). Construct a family of labelings $\{L'_1, L'_2, L'_{n+1}\}$ from $\{L_1, \ldots, L_{n+1}\}$
by (1) taking all labelings of partitions involving only subsets of $\phi_1$, $\phi_2$ and $\phi_{n+1}$.
For example, vectors $(\eta_3, \eta_4)$ and $(\eta_1, \eta_2, \eta_3, \eta_{n+1})$ would be discarded, while
$(\eta_1, \eta_2)$ and $(\eta_1, \eta_2, \eta_{n+1})$ would be kept; and (2) for p, set the labeling vector of
partition $\phi_1\phi_2$ to $(\alpha_1, \alpha_2) = (a, a)$. By Lemma 2, $\{L'_1, L'_2, L'_{n+1}\}$ does not have
BGSA. Let $\Phi' = \{\phi_1, \phi_2, \phi_{n+1}\}$ be such that $I_{\phi_1,L'_1} \wedge I_{\phi_2,L'_2} \not\Rightarrow I_{\phi_1\phi_2,L'_{n+1}}$, and
let $\Pi$ be the corresponding resolution refutation.

Construct $\Phi = \{\phi_1, \phi_2, \top, \ldots, \top, \phi_{n+1}\}$ by adding $(n - 2)$ copies of $\top$ to $\Phi'$.
$\Phi$ is unsatisfiable, and $\Pi$ is also a valid refutation for $\Phi$. From this point, we
assume that all interpolants are generated from $\Pi$.

Assume, by contradiction, that $\mathcal{F}$ has n-GSA. Then,

$$I_{\phi_1,L_1} \wedge \ldots \wedge I_{\phi_n,L_n} \Rightarrow I_{\phi_1 \ldots \phi_n,L_{n+1}}$$

But, because $\phi_3, \ldots, \phi_n$ do not contribute any clauses to $\Pi$, $I_{\phi_i,L_i} = \top$ for
$3 \leq i \leq n$. Hence,

$$I_{\phi_1,L_1} \wedge I_{\phi_2,L_2} \Rightarrow I_{\phi_1\phi_2,L_{n+1}}$$

However, by construction:

$$I_{\phi_1,L'_1} \wedge I_{\phi_2,L'_2} \not\Rightarrow I_{\phi_1\phi_2,L'_{n+1}}$$

which leads to a contradiction. Hence $\alpha$ must satisfy $CC_{nGSA}$.

Proposition 3. Any family $\{Itp_{L_0}, Itp_{L_1}, Itp_{L_2}\}$ has 2-PI.

Proof. Recall that $I_{\top,L_0} = \top$ and $I_{\phi_1\phi_2,L_2} = \bot$ for any $L_0$, $L_2$. Hence, 2-PI
reduces to the following two conditions: $\phi_1 \Rightarrow I_{\phi_1,L_1}$ and $I_{\phi_1,L_1} \wedge \phi_2 \Rightarrow \bot$,
which are true of any Craig interpolant.

Corollary 4. A family $\{Itp_{L_1}, Itp_{L_2}\}$ has 2-SA if and only if $\{L_1, L_2\}$ satisfies
$(\alpha_1, \alpha_2) \preceq \{(ab,ab), (a,b), (b,a)\}$.

Proof. Follows from Lemma 2 and Lemma 1.

Lemma 5. There exists a family $\{Itp_{L_0}, Itp_{L_1}, Itp_{L_2}\}$ that has 2-PI and a family
$\{Itp_{L'_1}, Itp_{L'_2}\}$ that has 2-SA, but the family $\{Itp_{L_0}, Itp_{L_1}, Itp_{L_2}, Itp_{L'_1}, Itp_{L'_2}\}$
does not have 2-STI.

Proof. By Theorem 4, a necessary condition for 2-STI is that $\{Itp_{L_1}, Itp_{L'_2}, Itp_{L_2}\}$
has BGSA. By Proposition 3, $\{L_0, L_1, L_2\}$ can be arbitrary. By Theorem 9
and Corollary 4, there exists $\{L'_1, L'_2\}$ such that $\{Itp_{L'_1}, Itp_{L'_2}\}$ has 2-SA, but
$\{Itp_{L_1}, Itp_{L'_2}, Itp_{L_2}\}$ does not have BGSA.

Lemma 6. The set of labeling constraints of any n-GSA strengthening is a subset
of the constraints of n-GSA.

Proof. Assume w.l.o.g. we strengthen the first subformula $\phi_1$. Then any variable
in any partition which does not involve $\phi_1$ has the same labeling vector and its
n-GSA labeling constraints are also the same. Instead, variables in any partition
$\phi_1\phi_{i_2} \ldots \phi_{i_k}$ now have a labeling vector $(\alpha_{i_2}, \ldots, \alpha_{i_k})$, where the first component
$\alpha_1$ is missing. Referring to the definition of $CC_{nGSA}$, it is easy to verify that the
set of the constraints for the strengthening is a subset of the constraints for
n-GSA.

Theorem 13. Any $\mathcal{F} = \{Itp_{L_{i_1}}, \ldots, Itp_{L_{i_k}}, Itp_{L_{n+1}}\}$ s.t. $k < n$ that has an
n-GSA strengthening property can be extended to a family that has n-GSA.

Proof. Refer to the definition of $CC_{nGSA}$ and to Lemma 6. We can complete $\mathcal{F}$
for example by introducing $n - k$ instances of McMillan's system $Itp_M$. Both
constraints (1) and (2) for n-GSA are satisfied, since $Itp_M$ always assigns label b
(recall the order $b \preceq ab \preceq a$). Note that $Itp_M$ is not necessarily the only possible
choice.
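
The completion argument can be illustrated by enumeration. The sketch below makes several simplifying assumptions that are not stated in the paper: it only considers a variable that occurs in every partition, it models the strengthening constraints by applying conditions (1) and (2) to the labels that remain after dropping the non-abstracted positions (following the reading of Lemma 6), and all names are hypothetical.

```python
from itertools import product

RANK = {"b": 0, "ab": 1, "a": 2}  # assumed encoding of b <= ab <= a
LABELS = ("a", "b", "ab")


def satisfies(vector, n):
    """Conditions (1) and (2) of the n-GSA constraints for one variable."""
    head = [vector[i] for i in sorted(vector) if i != n + 1]
    for pos, label in enumerate(head):
        others = head[:pos] + head[pos + 1:]
        if label == "a" and any(o != "b" for o in others):
            return False
    if n + 1 in vector:
        return all(RANK[label] <= RANK[vector[n + 1]] for label in head)
    return True


def extension_preserves(n, kept, fill="b"):
    """Check that filling the missing positions with `fill` keeps (1) and (2).

    kept is the set of indices in 1..n that already carry a LIS; the other
    indices in 1..n receive a new system that always assigns `fill`.
    """
    positions = sorted(kept) + [n + 1]
    missing = [i for i in range(1, n + 1) if i not in kept]
    for labels in product(LABELS, repeat=len(positions)):
        partial = dict(zip(positions, labels))
        if not satisfies(partial, n):
            continue  # the strengthening constraints do not hold
        full = dict(partial)
        full.update({i: fill for i in missing})
        if not satisfies(full, n):
            return False
    return True
```

Under these assumptions, completing with label b (McMillan's system) always preserves the constraints, whereas completing with label a can break condition (1), for example when an existing label is already a.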



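Theorem 14. PI holds for all single LISs.

Proof. In [17] we addressed n-PI for a family of LISs $\{Itp_{L_0}, \ldots, Itp_{L_n}\}$. Given
an inconsistent $\Phi = \{\phi_1, \ldots, \phi_n\}$, Table 3 shows the labelings $L_i$, $L_{i+1}$ for an
arbitrary step $I_{\phi_1 \ldots \phi_i,L_i} \wedge \phi_{i+1} \Rightarrow I_{\phi_1 \ldots \phi_{i+1},L_{i+1}}$ ($\psi_1 = \phi_1 \wedge \ldots \wedge \phi_i$, $\psi_2 = \phi_{i+1}$,
$\psi_3 = \phi_{i+2} \wedge \ldots \wedge \phi_n$):

Table 3: n-PI step.

p in ?                    Variable class, label
                          $\psi_1 \mid \psi_2\psi_3$    $\psi_1\psi_2 \mid \psi_3$
$\psi_1$                  $A, a$                        $A, a$
$\psi_2$                  $B, b$                        $A, a$
$\psi_3$                  $B, b$                        $B, b$
$\psi_1\psi_2$            $AB, \alpha_1$                $A, a$
$\psi_2\psi_3$            $B, b$                        $AB, \beta_2$
$\psi_1\psi_3$            $AB, \gamma_1$                $AB, \gamma_2$
$\psi_1\psi_2\psi_3$      $AB, \delta_1$                $AB, \delta_2$

We identified a set of constraints for $L_i$, $L_{i+1}$:

$$\gamma_1 \preceq \gamma_2 \qquad \delta_1 \preceq \delta_2$$

For a single LIS, $\gamma_1 = \gamma_2$ and $\delta_1 = \delta_2$, so all constraints are trivially satisfied for
$0 \leq i \leq n - 1$.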



